Clinically-guided models or foundation models? Predicting cervical spondylotic myelopathy from electronic health records

By
Salim Yakdan
Ben Warner
Zoher Ghogawala
Wilson Z. Ray
Mohamad Bydon
Michael P. Steinmetz
Richard T. Griffey
Randi Foraker
Adam Wilcox
Chenyang Lu
Jacob K. Greenberg
January 20, 2026

npj Digital Medicine

Overview

This study developed and externally validated machine learning models that use large-scale electronic health records (EHRs) to predict cervical spondylotic myelopathy (CSM) up to 30 months before diagnosis. The best-performing foundation model, clmbr-t-5k-csm, outperformed simpler clinically guided models at most time horizons, and the area under the precision-recall curve (AUPRC) was up to nearly 7-fold higher than that of a non-informative classifier.
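To make the comparison with a non-informative classifier concrete, the sketch below computes AUPRC as step-wise average precision in plain Python. It is an illustration only, not the study's evaluation code: the function name `average_precision`, the toy labels, and the toy scores are all hypothetical. It shows that a classifier assigning every patient the same score has an AUPRC equal to the case prevalence, which is why prevalence serves as the baseline.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 0/1 outcomes; scores: predicted risks (higher = more likely a case).
    Patients with tied scores are treated as a single threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank patients from highest to lowest predicted risk.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])

    ap = 0.0
    true_pos = 0
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        # Consume every patient sharing the current score.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            true_pos += pairs[j][1]
            j += 1
        recall = true_pos / n_pos
        precision = true_pos / j  # j patients flagged at this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical toy cohort: 5 cases among 200 patients (prevalence 0.025).
toy_labels = [1] * 5 + [0] * 195

# Non-informative classifier: identical score for everyone.
constant_scores = [0.5] * 200
baseline_ap = average_precision(toy_labels, constant_scores)

# Idealized classifier: every case scored above every control.
perfect_scores = [0.9] * 5 + [0.1] * 195
perfect_ap = average_precision(toy_labels, perfect_scores)

print(f"non-informative AUPRC: {baseline_ap:.3f}")
print(f"perfect-ranking AUPRC: {perfect_ap:.3f}")
```

With rare outcomes such as CSM, this baseline is small, so even modest absolute AUPRC values can represent a large relative gain.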

Background

Cervical spondylotic myelopathy (CSM) is the leading cause of spinal cord dysfunction in older adults, characterized by progressive neurological deficits due to degenerative cervical spinal stenosis. Early diagnosis is critical to prevent irreversible damage, but delays of two to six years are common, often due to limited clinical recognition and imaging access. Electronic health records (EHRs) offer a rich data source for predictive modeling, and recent advances in foundation models present an opportunity to improve early CSM detection across diverse healthcare settings.

Data Highlights

Dataset	Controls	CSM Cases	Age (CSM vs. Controls)	Clinical Encounters (CSM vs. Controls)
Merative	1,442,104	34,106	Older than controls	Higher than controls
BJC	497,510	13,200	Older than controls	Higher than controls

Across prediction horizons, AUPRC ranged from 0.12 to 0.163, a 5.07- to 6.9-fold improvement over the baseline prevalence of 0.0236, which is the AUPRC expected from a non-informative classifier. The clmbr-t-5k-csm foundation model performed best at the 6-, 12-, and 18-month horizons; at 24 and 30 months, the simpler simple-mamba and simple-ff models performed best.
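The fold-improvement figures follow from dividing each AUPRC by the baseline prevalence. The short sketch below reproduces that arithmetic from the numbers reported here; the function and variable names are hypothetical. It also derives each cohort's overall case prevalence from the table counts, on the assumption that the Controls and CSM Cases columns are disjoint groups. Those cohort-level values differ slightly from the reported 0.0236 baseline, which presumably refers to a specific evaluation set that this summary does not identify.

```python
BASELINE_PREVALENCE = 0.0236  # baseline prevalence reported in the summary


def fold_improvement(auprc, prevalence=BASELINE_PREVALENCE):
    """Ratio of a model's AUPRC to that of a non-informative classifier."""
    return auprc / prevalence


def cohort_prevalence(cases, controls):
    """Share of cases in a cohort, assuming cases and controls are disjoint."""
    return cases / (cases + controls)


# Lowest and highest AUPRC values reported across prediction horizons.
low_fold = fold_improvement(0.12)
high_fold = fold_improvement(0.163)

# Cohort-level prevalence from the Data Highlights table.
merative_prev = cohort_prevalence(cases=34_106, controls=1_442_104)
bjc_prev = cohort_prevalence(cases=13_200, controls=497_510)

print(f"fold improvement: {low_fold:.2f} to {high_fold:.2f}")
print(f"Merative prevalence: {merative_prev:.4f}")
print(f"BJC prevalence: {bjc_prev:.4f}")
```

The lower bound computed from the rounded AUPRC of 0.12 comes out marginally above the reported 5.07, most likely because the summary rounds the underlying AUPRC value.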

Key Findings

The clmbr-t-5k-csm foundation model achieved superior AUPRC at 6, 12, and 18 months prior to diagnosis compared to other models.
The simple-mamba and simple-ff models performed best at the 24- and 30-month prediction horizons, respectively.
The CEHBERT and CoreBEHRT foundation models underperformed both the clinically guided models and the other foundation models at all tested horizons.
Patients with CSM were older and had more clinical encounters, including orthopedic and neurosurgical visits, than controls.
Model AUPRC values ranged from 0.12 to 0.163, well above the baseline prevalence of 0.0236, indicating meaningful predictive ability.
Individualized prediction explanations were generated, highlighting relevant clinical features contributing to model predictions.

Clinical Implications

Machine learning models, particularly foundation models trained on large EHR datasets, can enhance early prediction of CSM, enabling timely diagnostic workups and interventions. Incorporating such models into clinical decision support systems may help reduce diagnostic delays and improve patient outcomes by identifying at-risk individuals before irreversible neurological damage occurs.

Conclusion

Foundation models applied to large-scale EHR data demonstrate promising performance for early prediction of cervical spondylotic myelopathy across clinically actionable time horizons. These findings support further development and integration of AI-driven tools to facilitate earlier diagnosis and treatment of CSM.

References

Article Source 2024 -- Comparing Clinically-Guided and Foundation Models for Predicting Cervical Spondylotic Myelopathy Using Electronic Health Records