One Step Closer to Real-Time Detection of Missed Opportunities for Diagnosis in the ED Using LLMs

By Fernanda Bellolio and Daniel Cabrera
June 29, 2026

Clinical Report: Advancing Real-Time Identification of Diagnostic Oversights in the ED

Overview

This study evaluates the use of large language models (LLMs) to identify missed opportunities for diagnosis in the emergency department (ED). Missed opportunities were present in 13.5% of the 288 encounters analyzed. Discrimination varied by model and cohort, with AUCs of 0.65 to 0.73 for the 72-hour return cohort and 0.57 to 0.82 for the floor-to-ICU cohort.

Background

Identifying diagnostic oversights in the ED is crucial for improving patient outcomes, as traditional methods rely on retrospective reviews that can be time-consuming and inefficient. Automated tools like electronic triggers (eTriggers) have been proposed to enhance this process.

Data Highlights

Model               AUC (72-hour return)   AUC (floor-to-ICU)
Claude Sonnet 4     0.65                   0.57
Claude Sonnet 4.6   not shown              not shown
Claude Opus 4.6     not shown              not shown
Gemini 3 Pro        not shown              not shown
GPT-5               not shown              not shown
GPT-5mini           0.73                   0.82

Key Findings

  • The overall prevalence of missed opportunities for diagnosis was 13.5% among 288 encounters.
  • The number needed to screen was 9.1 for 72-hour return and 5.4 for floor-to-ICU cohorts.
  • Model discrimination AUCs ranged from 0.65 to 0.73 for 72-hour return and 0.57 to 0.82 for floor-to-ICU cohorts.
  • Claude Sonnet 4 favored higher sensitivity, while GPT-5mini favored higher specificity in binary classifications.
  • Physician interrater agreement was 81.9%.
  • LLMs can analyze unstructured clinical notes to detect missed diagnostic opportunities.
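
The prevalence and number-needed-to-screen figures above can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that the report does not state: that the 13.5% figure was rounded from a whole number of encounters, and that number needed to screen is the reciprocal of the per-cohort prevalence. The function names are illustrative only.

```python
def implied_count(total: int, percent: float) -> int:
    """Nearest whole number of encounters matching a reported percentage."""
    return round(total * percent / 100)


def implied_prevalence(number_needed_to_screen: float) -> float:
    """Prevalence implied by an NNS, ASSUMING NNS = 1 / prevalence."""
    return 1.0 / number_needed_to_screen


total_encounters = 288  # from the report

missed = implied_count(total_encounters, 13.5)
print(f"missed opportunities: about {missed} of {total_encounters}")

# NNS values from the report; the reciprocal relationship is an assumption.
for cohort, nns in [("72-hour return", 9.1), ("floor-to-ICU", 5.4)]:
    print(f"{cohort}: implied prevalence {implied_prevalence(nns):.1%}")
```

Under these assumptions, the overall figure corresponds to roughly 39 encounters, and the cohort-level prevalence works out to about 11.0% for 72-hour returns and 18.5% for the floor-to-ICU cohort.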

Clinical Implications

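The findings suggest that LLMs could help identify missed diagnostic opportunities in real time, potentially improving patient safety. Because the models trade sensitivity against specificity differently, the choice of model shapes the review process: a model that favors sensitivity, such as Claude Sonnet 4, flags more encounters for review, while one that favors specificity, such as GPT-5mini, produces fewer false alarms.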
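
The sensitivity and specificity trade-off, and the AUC that summarizes it, can be illustrated with a minimal sketch. It assumes each model produces a numeric score per encounter and that encounters scoring at or above a threshold are flagged for review; the report does not describe how model outputs were scored, and the labels and scores below are made-up toy values, not study data.

```python
from typing import List, Tuple


def sensitivity_specificity(
    labels: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are flagged."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = missed opportunity according to the reference label.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

for threshold in (0.25, 0.65):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
print(f"AUC: {auc(labels, scores):.2f}")
```

On this toy data the low threshold catches every missed opportunity at the cost of more false alarms (sensitivity 1.00, specificity 0.60), while the high threshold reverses the balance (sensitivity 0.33, specificity 0.80). The AUC of 0.80 is independent of the threshold, which is why two models with similar AUCs can still behave very differently as binary classifiers.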

Conclusion

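The study indicates that LLMs can identify missed diagnostic opportunities in emergency medicine from unstructured clinical notes, with discrimination that varies by model and cohort (AUC 0.57 to 0.82).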

Related Resources & Content

  1. Marks et al., JMIR Medical Informatics, 2026. Advancing Real-Time Identification of Diagnostic Oversights in the ED
  2. npj Digital Medicine. Collaboration Between Humans and Large Language Models in Clinical Practice: A Systematic Review and Meta-Analysis
  3. Journal of Medical Internet Research (JMIR). Automated Identification of Nursing Diagnoses and Interventions From Nursing Records Using a Retrieval-Augmented Large Language Model Approach: Quantitative Study
  4. npj Digital Medicine. Uncertainty-aware large language models for explainable disease diagnosis
  5. Implementation of electronic triggers to identify diagnostic errors in emergency departments
  6. Agency for Healthcare Research and Quality. 3. Results
  7. BMC Emergency Medicine. Diagnostic accuracy of large language models for emergency department triage: a systematic review and meta-analysis
