One Step Closer to Real-Time Detection of Missed Opportunities for Diagnosis in the ED Using LLMs - Summary - MDSpire

  • By

  • Fernanda Bellolio

  • Daniel Cabrera

  • June 29, 2026

Objective:

To evaluate the effectiveness of large language models (LLMs) in identifying missed opportunities for diagnosis in the emergency department.

Approach:
  • Study Design: The study evaluated how well 6 commercially available LLMs identified missed diagnostic opportunities across 288 emergency department encounters.
Key Findings:
  • The overall prevalence of missed opportunities for diagnosis was 13.5%.
  • Areas under the receiver operating characteristic curve (AUCs) ranged from 0.65 to 0.73 in the 72-hour return cohort and from 0.57 to 0.82 in the floor-to-ICU cohort.
  • Models exhibited different sensitivity-specificity tradeoffs, with Claude Sonnet 4 favoring sensitivity and GPT-5mini favoring specificity.
  • Physician interrater agreement was 81.9%, indicating variability in expert reviews.
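
To make the AUC and sensitivity-specificity findings above concrete, here is a minimal sketch of how those metrics are computed from model scores and reviewer labels. The labels, scores, thresholds, and function names are invented for illustration; they are not study data and not the authors' code.

```python
# Toy illustration only: labels and scores are made up, not study data.

def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are flagged."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# 1 = reviewers judged a missed opportunity, 0 = no missed opportunity.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.8, 0.5, 0.4, 0.2, 0.2, 0.1, 0.1]

print("AUC:", round(auc(labels, scores), 2))
print("low threshold (0.3): ", sens_spec(labels, scores, 0.3))
print("high threshold (0.7):", sens_spec(labels, scores, 0.7))
```

In this toy data the low threshold catches every positive case at the cost of more false alarms, while the high threshold does the reverse. A single AUC value summarizes both settings, which is why two models with similar AUCs can still sit at very different operating points.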
Interpretation:

The study indicates that LLMs can detect a low-prevalence outcome, missed opportunities for diagnosis, from unstructured clinical notes.

Limitations:
  • The study is retrospective in nature.
  • Aggregate discrimination metrics alone are insufficient to determine model appropriateness for clinical tasks.
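
One way to see why aggregate discrimination is not enough: at the 13.5% prevalence reported above, the share of flagged encounters that are true missed opportunities (the positive predictive value) depends heavily on the operating point. The sketch below applies Bayes' rule to two hypothetical operating points. The sensitivity and specificity values are assumptions chosen for illustration, not figures from the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


PREVALENCE = 0.135  # overall prevalence reported in the summary

# Hypothetical operating points (not reported in the study).
sensitivity_favoring = ppv(0.90, 0.50, PREVALENCE)
specificity_favoring = ppv(0.50, 0.90, PREVALENCE)

print("sensitivity-favoring PPV:", round(sensitivity_favoring, 2))
print("specificity-favoring PPV:", round(specificity_favoring, 2))
```

Under these assumed values, roughly one in five flags from the sensitivity-favoring point would be a true missed opportunity, versus a bit under half for the specificity-favoring point, which in turn would miss half of the true cases. Which tradeoff is acceptable depends on the clinical task, not on the AUC alone.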
Conclusion:

The findings indicate progress towards the deployment of LLM-based screening tools for real-time diagnostic safety in emergency medicine.
