Evaluation of large language models for diagnostic impression generation from brain MRI report findings: a multicenter benchmark and reader study - Scorecard - MDSpire
A top-three differential-diagnosis prompting strategy improved patient-level accuracy to 97.6%, versus 87.1% for single-diagnosis prompting.
Integration of DeepSeek-R1 assistance improved radiologist diagnostic accuracy and reduced reading time, especially benefiting junior radiologists.
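The comparison between single-diagnosis and top-three prompting can be read as a top-k accuracy comparison. The sketch below is a minimal illustration, assuming a case counts as correct when the reference diagnosis appears among the first k diagnoses the model lists and that matching is a simple case-insensitive string comparison. The function name `top_k_accuracy` and the toy cases are hypothetical; they are not the study's data or scoring procedure.

```python
from typing import Sequence


def top_k_accuracy(
    predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose reference diagnosis is among the first k predictions."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one case is required")
    if k < 1:
        raise ValueError("k must be at least 1")

    def norm(text: str) -> str:
        # Assumption: exact match after trimming and lowercasing.
        return text.strip().lower()

    hits = sum(
        norm(ref) in {norm(p) for p in preds[:k]}
        for preds, ref in zip(predictions, references)
    )
    return hits / len(references)


# Hypothetical toy cases: ranked model outputs and one reference diagnosis each.
preds = [
    ["glioblastoma", "brain metastasis", "cerebral abscess"],
    ["multiple sclerosis", "neuromyelitis optica", "small vessel disease"],
    ["meningioma", "schwannoma", "dural metastasis"],
    ["acute infarct", "encephalitis", "low-grade glioma"],
]
refs = ["brain metastasis", "multiple sclerosis", "meningioma", "encephalitis"]

top1 = top_k_accuracy(preds, refs, k=1)  # single-diagnosis scoring
top3 = top_k_accuracy(preds, refs, k=3)  # top-three scoring
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

Because every top-1 hit is also a top-3 hit, top-three accuracy can never be lower than single-diagnosis accuracy on the same outputs, so the two figures are best interpreted side by side.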
Guideline-Based Recommendations
Diagnosis
Use advanced large-scale LLMs such as DeepSeek-R1 for automated diagnostic impression generation from brain MRI reports.
Incorporate structured report findings and relevant clinical information to optimize model performance.
Apply a top-three differential-diagnosis prompting strategy to enhance diagnostic accuracy.
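The diagnosis recommendations above (structured findings, relevant clinical information, and a top-three request) can be combined into a single prompt. The following is a rough sketch only: the function `build_prompt`, its parameters, and the prompt wording are all hypothetical and are not the prompts used in the study. No model API is called; the function only assembles text.

```python
def build_prompt(findings: str, clinical_info: str = "", n_diagnoses: int = 3) -> str:
    """Assemble a plain-text prompt from report findings and optional clinical context."""
    if n_diagnoses < 1:
        raise ValueError("n_diagnoses must be at least 1")

    lines = [
        "You are assisting a radiologist with a brain MRI report.",
        "",
        "Report findings:",
        findings.strip(),
    ]
    # Clinical information is included only when it is actually provided.
    if clinical_info.strip():
        lines += ["", "Clinical information:", clinical_info.strip()]

    if n_diagnoses == 1:
        task = "State the single most likely diagnosis."
    else:
        task = (
            f"List the {n_diagnoses} most likely diagnoses, "
            "ranked from most to least likely, one per line."
        )
    lines += ["", task]
    return "\n".join(lines)


# Hypothetical example input, for illustration only.
prompt = build_prompt(
    findings="Ring-enhancing lesion in the right frontal lobe with surrounding edema.",
    clinical_info="58-year-old with headache and a history of lung cancer.",
)
print(prompt)
```

Setting `n_diagnoses=1` produces the single-diagnosis variant, which makes it straightforward to compare the two prompting strategies on the same inputs.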
Management
Integrate LLM assistance into radiology workflows to support report drafting and improve efficiency.
Use LLM outputs as supportive tools rather than sole diagnostic sources, maintaining radiologist oversight.
Monitoring & Follow-up
Assess diagnostic accuracy and reading time metrics when implementing LLM assistance in clinical practice.
Monitor performance differences across radiologist experience levels to tailor support accordingly.
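One way to track the monitoring items above is to group each read by radiologist experience level and by whether LLM assistance was used, then compare accuracy and mean reading time per group. This is a minimal sketch under stated assumptions: the `Reading` record, its fields, the experience labels, and the sample values are hypothetical and do not come from the study.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    experience: str  # hypothetical label, e.g. "junior" or "senior"
    assisted: bool   # True if the read used LLM assistance
    correct: bool    # True if the final diagnosis matched the reference
    seconds: float   # reading time for the case


def summarize(readings):
    """Accuracy and mean reading time per (experience, assisted) group."""
    groups = defaultdict(list)
    for r in readings:
        groups[(r.experience, r.assisted)].append(r)
    return {
        key: {
            "n": len(rs),
            "accuracy": mean(1.0 if r.correct else 0.0 for r in rs),
            "mean_seconds": mean(r.seconds for r in rs),
        }
        for key, rs in groups.items()
    }


# Hypothetical sample reads, for illustration only.
readings = [
    Reading("junior", False, False, 120.0),
    Reading("junior", False, True, 100.0),
    Reading("junior", True, True, 80.0),
    Reading("junior", True, True, 90.0),
    Reading("senior", False, True, 70.0),
    Reading("senior", True, True, 60.0),
]

summary = summarize(readings)
for (experience, assisted), stats in sorted(summary.items()):
    mode = "assisted" if assisted else "unassisted"
    print(f"{experience:>6} {mode:<10} n={stats['n']} "
          f"acc={stats['accuracy']:.2f} time={stats['mean_seconds']:.1f}s")
```

Keeping assisted and unassisted reads in separate groups for each experience level makes it possible to see whether the benefit differs between junior and senior radiologists, rather than only in aggregate.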
Risks
Potential overreliance on AI-generated impressions without adequate clinical validation.
Variability in model performance depending on input data structure and clinical context.
Patient & Prescribing Data
Patients undergoing brain MRI scans, spanning diverse brain disease categories across multiple centers.
Automated diagnostic impression generation using LLMs can enhance diagnostic accuracy and workflow efficiency, potentially improving patient care through timely and accurate reporting.
Clinical Best Practices
Employ structured MRI report findings and relevant clinical data as inputs to LLMs for optimal diagnostic impression generation.
Adopt multi-diagnosis prompting strategies to capture differential diagnoses and improve accuracy.
Use LLM assistance to reduce radiologist workload and reading time, particularly supporting less experienced radiologists.
Maintain radiologist oversight to validate AI-generated impressions and ensure clinical safety.