Evaluating large language model-generated brain MRI protocols: performance of GPT4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B - Takeaways - MDSpire

By Su Hwan Kim, Severin Schramm, Lena Schmitzer, Kerem Serguen, Sebastian Ziegelmayer, Felix Busch, Alexander Komenda, Marcus R. Makowski, Lisa C. Adams, Keno K. Bressem, Claus Zimmer, Jan Kirschke, Benedikt Wiestler, Dennis Hedderich, Tom Finck, Jannis Bodden

September 3, 2025


1. The study evaluates how well four large language models generate brain MRI protocols from realistic clinical cases.

2. Brain MRI protocoling is crucial: it must balance comprehensive imaging against efficiency to minimize repeat examinations and healthcare costs.

3. Two experienced neuroradiologists defined reference protocols for 150 fictitious cases, resolving discrepancies through consensus.

4. The models tested are GPT-4o, o3-mini, DeepSeek-R1, and Qwen2.5-72B, with evaluations conducted under enhanced and base conditions.

5. The study aims to assess the potential of AI to improve MRI protocoling, given the increasing demands on radiologists' time.
