Can AI Extract Echo Report Data as Accurately as Expert Annotation? - Summary - MDSpire
Can AI Extract Echo Report Data as Accurately as Expert Annotation?
In a study of 50 free-text echocardiography reports, GPT-5 mini extracted 55 cardiovascular fields with 92.5% exact-match agreement with expert annotation.
Objective:
To evaluate the accuracy of a large language model (GPT-5 mini) in extracting structured cardiovascular data from free-text echocardiography reports.
Approach:
Study Design: The study involved extracting 55 cardiovascular fields from de-identified reports in the MIMIC-III EchoNotes dataset, comparing model outputs to expert annotations.
Data Extraction: Fifty reports were annotated by a board-certified echocardiographer, GPT-5 mini extracted the same fields from those reports, and a blinded cardiologist adjudicated discrepancies.
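The comparison step described above can be pictured as a field-by-field diff between the annotator's record and the model's record, with mismatches queued for adjudication. The following is a minimal sketch, not the study's pipeline: the field names (such as `lvef_percent`), the `find_discrepancies` helper, and the normalization rule are all illustrative assumptions.

```python
from typing import Optional

# One report's extracted fields; None means the field was not recorded.
Record = dict[str, Optional[str]]


def normalize(value: Optional[str]) -> Optional[str]:
    """Trim whitespace and lowercase so trivial formatting differences do not count."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return cleaned or None


def find_discrepancies(
    expert: Record, model: Record
) -> list[tuple[str, Optional[str], Optional[str]]]:
    """Return (field, expert_value, model_value) for every field where the two disagree."""
    rows = []
    for field in sorted(set(expert) | set(model)):
        e = normalize(expert.get(field))
        m = normalize(model.get(field))
        if e != m:
            rows.append((field, e, m))
    return rows


# Hypothetical values for a single report (not taken from the study).
expert = {"lvef_percent": "55", "mitral_regurgitation": "Mild", "pericardial_effusion": None}
model = {"lvef_percent": "55", "mitral_regurgitation": "mild ", "pericardial_effusion": "none"}

for field, e, m in find_discrepancies(expert, model):
    print(f"{field}: expert={e!r} model={m!r}")
```

In this toy record the only row sent for adjudication is `pericardial_effusion`, where the model recorded a normal finding that the annotator left blank.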
Key Findings:
The large language model achieved 92.5% exact-match agreement with expert annotation, with precision ranging from 96% to 98% across categories and recall ranging from 85% to 95%.
The model identified 120 additional clinical values not documented by human annotators, reflecting both over-extraction of normal findings and human annotation errors.
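To make the three reported metrics concrete, here is a small sketch of how exact-match agreement, precision, and recall could be computed from paired field values. The summary does not give the study's scoring rules, so the definitions below are assumptions: a field agrees when both values are identical (including both blank), a true positive is a non-blank value that matches, precision divides by the fields the model filled in, and recall divides by the fields the annotator filled in. The `score` function and the sample data are hypothetical.

```python
from typing import Optional

# (expert_value, model_value); None means the field was left blank.
Pair = tuple[Optional[str], Optional[str]]


def score(pairs: list[Pair]) -> dict[str, float]:
    """Compute exact-match agreement, precision, and recall under the assumed definitions."""
    if not pairs:
        raise ValueError("at least one field pair is required")
    agree = sum(1 for e, m in pairs if e == m)
    true_pos = sum(1 for e, m in pairs if e is not None and e == m)
    model_present = sum(1 for _, m in pairs if m is not None)
    expert_present = sum(1 for e, _ in pairs if e is not None)
    return {
        "exact_match": agree / len(pairs),
        "precision": true_pos / model_present if model_present else 0.0,
        "recall": true_pos / expert_present if expert_present else 0.0,
    }


pairs = [
    ("55", "55"),      # both record the same value
    ("mild", "mild"),  # both record the same value
    (None, None),      # both leave the field blank
    (None, "normal"),  # model fills in a value the annotator left blank
    ("trace", None),   # model misses a value the annotator recorded
]
print(score(pairs))
```

On this toy data, exact-match agreement is 0.6 and precision and recall are both 2/3. The fourth pair shows how a value extracted by the model but absent from the annotation lowers precision under these definitions, whether it reflects over-extraction or an annotation omission.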
Interpretation:
The model showed strong performance in extracting echocardiography data, but over-extraction of normal findings was noted as a potential issue.
Limitations:
The study did not report on prospective clinical use, diagnostic accuracy, patient outcomes, or workflow-safety outcomes.
Performance varied across exam types, particularly for stress echocardiograms, whose reports have lower information density.
Conclusion:
Large language models demonstrated strong capability for automated echocardiography data extraction.