Evaluating large language model-generated brain MRI protocols: performance of GPT4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B

By
Su Hwan Kim
Severin Schramm
Lena Schmitzer
Kerem Serguen
Sebastian Ziegelmayer
Felix Busch
Alexander Komenda
Marcus R. Makowski
Lisa C. Adams
Keno K. Bressem
Claus Zimmer
Jan Kirschke
Benedikt Wiestler
Dennis Hedderich
Tom Finck
Jannis Bodden
September 3, 2025

European Radiology

Objective:

To evaluate the ability of large language models (LLMs) to suggest granular, sequence-level brain MRI protocols based on realistic clinical cases, addressing current challenges in MRI protocoling.

Key Findings:

LLMs can generate MRI protocols that align with expert-defined protocols.
Inter-rater agreement among the radiologists was quantified with Cohen’s kappa as a measure of rating reliability.
LLM performance varied depending on whether additional context was included, highlighting the importance of contextual information.
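
As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two raters. The function name, the rating labels, and the example ratings are hypothetical and are not taken from the study; the sketch only shows how observed agreement is corrected for agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    # If both raters always use the same single label, kappa is undefined;
    # returning 1.0 here is a convention chosen for this sketch.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical ratings of six LLM-generated protocols by two radiologists.
rater_a = ["adequate", "adequate", "inadequate", "adequate", "inadequate", "inadequate"]
rater_b = ["adequate", "inadequate", "inadequate", "adequate", "inadequate", "adequate"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.3f}")
```

In this invented example the raters agree on four of six items (observed agreement 0.667) while chance agreement is 0.5, giving a kappa of about 0.333. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.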

Interpretation:

The study suggests that LLMs could assist in protocoling MRI scans, which may reduce radiologist workload and improve efficiency as demand for MRI services increases.

Limitations:

The study used fictitious cases, which may not capture the complexity of real-world cases and may therefore limit the applicability of the findings.
The inter-rater agreement observed in the study may not reflect the wider variation in clinical practice, so further validation is needed.

Conclusion:

LLMs show promise in generating MRI protocols, which could enhance clinical workflows and reduce errors in protocoling, but further research is needed to validate these findings in real-world settings.
