Automated identification of fall-related injuries in unstructured clinical notes

By
Wendong Ge
Lilian M Godeiro Coelho
Maria A Donahue
Hunter J Rice
Deborah Blacker
John Hsu
Joseph P Newhouse
Sonia Hernández-Díaz
Sebastien Haneuse
Brandon Westover
Lidia M V R Moura
July 26, 2024

American Journal of Epidemiology

Overview

This study developed and validated natural language processing (NLP) models to identify fall-related injuries (FRIs) in older adults from unstructured clinical documentation. Among the five models tested, RoBERTa performed best (precision 0.90, recall 0.91, F1 score 0.91), highlighting its potential to support large-scale clinical research on FRIs.

Background

Fall-related injuries are a leading cause of hospitalization and emergency visits among adults aged 65 and older, incurring substantial healthcare costs and impacting patient independence. Manual review of electronic health records (EHRs) to identify FRIs is labor-intensive and prone to error, motivating the use of automated natural language processing techniques. Previous approaches using support vector machines have been supplemented by advanced transformer-based models like BERT, which have shown promise in detecting various medical conditions from unstructured text. This study aimed to leverage these advancements to improve FRI identification in a large healthcare system's EHR data.

Data Highlights

Metric	RoBERTa Performance	95% Confidence Interval
Precision	0.90	0.88 - 0.91
Recall	0.91	0.90 - 0.93
F1 Score	0.91	0.89 - 0.92
AUROC	0.96	0.95 - 0.97
AUPR	0.96	0.95 - 0.97
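
For readers who want to see how figures of this kind are derived, the following is a minimal sketch of computing precision, recall, and F1 from binary labels, with a percentile bootstrap interval. The source does not state how the confidence intervals were obtained, so the bootstrap here is an assumption, and the toy labels and function names are hypothetical.

```python
import random


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 = FRI."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def bootstrap_ci(y_true, y_pred, metric_index, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one metric (0=precision, 1=recall, 2=F1).

    Assumption: paragraphs are resampled independently; the study may have
    used a different procedure (for example, resampling by patient).
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_true = [y_true[i] for i in idx]
        sample_pred = [y_pred[i] for i in idx]
        stats.append(precision_recall_f1(sample_true, sample_pred)[metric_index])
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy labels (hypothetical, not study data): 10 FRI and 10 non-FRI paragraphs.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] * 1 + [1] * 1 + [0] * 9

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
f1_lower, f1_upper = bootstrap_ci(y_true, y_pred, metric_index=2)
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two; small differences between a recomputed and a reported F1 can arise from rounding of the published values.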

Key Findings

RoBERTa outperformed other NLP models including vanilla BERT, ClinicalBERT, DistilBERT, and SVM in detecting FRIs from clinical notes.
The model achieved high precision (0.90) and recall (0.91), indicating accurate and comprehensive identification of FRIs.
Training involved a three-stage process: masked language modeling, general boolean question-answering, and FRI-specific question-answering.
The study utilized a large dataset of 154,949 paragraphs containing FRI-related keywords from 1,669 patients aged 65 and older.
Expert manual labeling of 5,000 paragraphs, combined with validated pattern-based annotations, provided the benchmark and validated-standard labels used for model training and testing.
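
The keyword-based paragraph selection and the boolean question-answering framing described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the keyword list, the question wording, and all function and class names are hypothetical, since the source gives none of them.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical keyword list; the study's actual keyword set is not given here.
FRI_KEYWORDS = ["fall", "falls", "fell", "fallen", "fracture", "tripped", "slipped"]
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in FRI_KEYWORDS) + r")\b",
    re.IGNORECASE,
)

# Hypothetical question wording for the FRI-specific question-answering stage.
QUESTION = "Does this paragraph describe a fall-related injury?"


@dataclass
class BooleanQAExample:
    question: str
    passage: str
    label: Optional[int] = None  # 1 = yes, 0 = no, None = not yet labeled


def split_paragraphs(note: str) -> List[str]:
    """Split a clinical note into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", note) if p.strip()]


def candidate_paragraphs(note: str) -> List[str]:
    """Keep only paragraphs that mention at least one keyword."""
    return [p for p in split_paragraphs(note) if KEYWORD_RE.search(p)]


def to_boolean_qa(paragraphs: List[str]) -> List[BooleanQAExample]:
    """Pair each candidate paragraph with the yes/no question."""
    return [BooleanQAExample(question=QUESTION, passage=p) for p in paragraphs]


# Invented example note, not real patient text.
note = (
    "Patient fell at home and sustained a hip fracture.\n\n"
    "Medications reviewed, no changes.\n\n"
    "No falls reported since last visit."
)
examples = to_boolean_qa(candidate_paragraphs(note))
```

In this toy note the third paragraph matches a keyword but describes no injury, which illustrates why keyword matching alone only narrows the candidates and a trained classifier is still needed to answer the yes/no question.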

Clinical Implications

The implementation of advanced NLP models like RoBERTa can significantly streamline the identification of fall-related injuries in unstructured clinical documentation, reducing reliance on manual chart review. This automation facilitates more efficient and accurate large-scale epidemiological studies and quality improvement initiatives targeting fall prevention in older adults. Clinicians and researchers may leverage these tools to better monitor and address FRIs within healthcare systems.

Conclusion

RoBERTa-based NLP models provide a reliable and efficient method for detecting fall-related injuries in unstructured clinical notes, offering a valuable tool to enhance clinical research and potentially improve patient care outcomes related to falls in older adults.

References

Mass General Brigham Study 2024 -- Automated Detection of Fall-Associated Injuries in Unstructured Clinical Documentation


