Large Language Model Automated Extraction of Clinical Signs and Symptoms From Emergency Department Reports for Machine Learning Prediction Models: Development and Validation Study
Objective:
To evaluate whether a small multilingual LLM (Qwen 2.5:14B) can automatically extract clinical features from Dutch emergency department (ED) reports and provide reliable inputs for a prediction model for acute abdominal pain (AAP), a presentation in which timely and accurate management is critical.
Key Findings:
Interrater agreement for manually annotated features was high (Krippendorff α of 0.93 for binary features), indicating that the physician annotations are reliable.
LLM-based feature extraction achieved accuracy comparable to physician annotations, suggesting potential for clinical application.
The study supports the feasibility of using LLMs for scalable, privacy-preserving workflows in ED decision support, which could enhance patient outcomes.
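To make the agreement statistic in the first finding concrete, the following is a minimal sketch of Krippendorff's α for nominal (including binary) ratings. The function name `krippendorff_alpha_nominal` and the toy ratings are assumptions made for illustration; they are not the study's code or data.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of tuples, one per annotated item, holding the value
    each rater assigned (None marks a missing rating). Hypothetical helper,
    not taken from the study.
    """
    # Build the coincidence matrix: every ordered pair of ratings within a
    # unit contributes 1 / (m - 1), where m is the number of ratings there.
    coincidences = Counter()
    for values in units:
        values = [v for v in values if v is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings carries no pair
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and the overall number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n < 2:
        return float("nan")  # nothing to compare

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # only one category was ever used
    return 1.0 - observed / expected


# Toy example: two raters, four reports, one binary feature (1 = present).
toy_ratings = [(1, 1), (0, 0), (1, 0), (0, 0)]
print(round(krippendorff_alpha_nominal(toy_ratings), 3))  # 0.533
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so the reported 0.93 sits close to the top of the scale.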
Interpretation:
Using a small multilingual LLM to extract features from ED reports is promising: the results suggest it could be integrated into clinical workflows, which could streamline processes and improve patient care.
Limitations:
The study focused on a specific clinical use case (AAP) and may not generalize to other conditions.
Results are based on a single hospital's data, which may limit external validity.
Potential biases in data collection or annotation could affect the reliability of the findings.
Conclusion:
Automated extraction of clinical features using LLMs can enhance data usability in emergency medicine, supporting improved decision-making processes.
by Anoeska Schipper, Peter Belgers, Rory David O'Connor, Lieke van de Wouw, Luc Builtjes, Joeran S Bosma, Ron Kusters, Steef Kurstjens, Matthieu Rutten, Bram van Ginneken