Multidisciplinary prediction of running-related injuries using machine learning
By Han Wu, Katherine Brooke-Wavell, Michael R. Barnes, Zainab Awan, Sarabjit Mastana, Sam Allen, and Richard C. Blagrove
February 6, 2026
Clinical Scorecard: Integrative Machine Learning Approaches for Predicting Running-Related Injuries
At a Glance
Category | Detail
Condition | Endurance running-related injuries (RRI)
Key Mechanisms | Multifactorial risk factors including genetics, history, muscular strength, biomechanics, body composition, nutrition, and training
Target Population | Competitive endurance runners
Care Setting | Sports medicine and injury prevention monitoring
Key Highlights
A machine-learning-ready weekly RRI prediction dataset was developed from 142 competitive endurance runners monitored over 12 months.
Random forest models achieved the best predictive performance (AUC ~0.78), outperforming most other algorithms.
Including a broader range of risk factors significantly improved logistic regression model performance.
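To make the AUC figure concrete: an AUC of about 0.78 means that, for a randomly chosen injured runner-week and a randomly chosen uninjured one, the model assigns the higher risk score to the injured week roughly 78% of the time. The sketch below computes AUC by that pairwise definition. The labels and scores are made-up illustrative values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (injured, uninjured) pairs ranked correctly.

    labels: 1 for an injured runner-week, 0 otherwise.
    scores: predicted injury risk for the same runner-weeks.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one injured and one uninjured example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only (not from the study).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.50, 0.90]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based computation would be the usual choice for a full weekly dataset.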
Guideline-Based Recommendations
Diagnosis
Use a multidisciplinary risk factor assessment, including genetic, biomechanical, and training data, for individualized RRI risk evaluation.
Management
Incorporate machine learning models, particularly random forest algorithms, to predict injury risk and guide preventive interventions.
Monitoring & Follow-up
Monitor runners' risk factors and injury status prospectively each week to enable timely prediction and management.
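One way weekly monitoring data could be arranged for prediction is as runner-week rows, each pairing one week's risk factors with the following week's injury status. The sketch below assumes that next-week labelling scheme, which the summary does not specify, and uses hypothetical names throughout (`WeeklyRecord`, `distance_km`, `build_examples`).

```python
from dataclasses import dataclass


@dataclass
class WeeklyRecord:
    runner_id: str
    week: int
    distance_km: float  # hypothetical training-load feature
    injured: bool       # injury status reported for this week


def build_examples(records):
    """Pair each week's features with the next week's injury status.

    Assumption: the prediction target is injury in the following week.
    Pairs of records that are not consecutive weeks are skipped.
    """
    by_runner = {}
    for rec in records:
        by_runner.setdefault(rec.runner_id, []).append(rec)

    examples = []
    for runner_id, weeks in by_runner.items():
        weeks.sort(key=lambda rec: rec.week)
        for cur, nxt in zip(weeks, weeks[1:]):
            if nxt.week != cur.week + 1:
                continue  # missing week: no label available for this row
            examples.append({
                "runner_id": runner_id,
                "week": cur.week,
                "distance_km": cur.distance_km,
                "injured_next_week": int(nxt.injured),
            })
    return examples


# Illustrative records only (not from the study).
records = [
    WeeklyRecord("A", 1, 40.0, False),
    WeeklyRecord("A", 2, 55.0, False),
    WeeklyRecord("A", 3, 60.0, True),
    WeeklyRecord("B", 1, 30.0, False),
    WeeklyRecord("B", 3, 35.0, True),  # week 2 missing for runner B
]
examples = build_examples(records)
for row in examples:
    print(row)
```

Skipping rows with a missing following week keeps features and labels aligned; imputing missed weeks would be a separate design decision.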
Risks
Recognize that RRIs are multifactorial, so comprehensive data collection is needed to improve prediction accuracy.
Patient & Prescribing Data
142 competitive endurance runners were monitored weekly over 12 months.
Machine learning models can stratify injury risk to inform personalized training adjustments and injury prevention strategies.
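As a rough illustration of risk stratification, predicted weekly injury probabilities can be mapped to bands that prompt different actions. The thresholds and band names below are arbitrary assumptions chosen for the example, not values reported by the study, and `stratify` is a hypothetical helper.

```python
def stratify(prob, low=0.10, high=0.30):
    """Map a predicted injury probability to a risk band.

    The cut-offs `low` and `high` are illustrative assumptions and
    would need to be calibrated against real outcome data.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob < low:
        return "low"
    if prob < high:
        return "moderate"
    return "high"


# Illustrative predicted probabilities (not from the study).
predicted = [0.04, 0.18, 0.42]
bands = [stratify(p) for p in predicted]
print(bands)
```

In practice the bands would only be as trustworthy as the model's calibration, so thresholds like these are a starting point for discussion rather than a clinical rule.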
Clinical Best Practices
Collect and integrate high-quality multidisciplinary risk factors for accurate injury risk prediction.
Apply random forest machine learning models for superior predictive performance in RRI risk assessment.
Use broader risk factor datasets to enhance logistic regression model accuracy where applicable.
Maintain prospective and continuous monitoring to capture dynamic injury risk changes.