Multidisciplinary prediction of running-related injuries using machine learning

By Han Wu, Katherine Brooke-Wavell, Michael R. Barnes, Zainab Awan, Sarabjit Mastana, Sam Allen, and Richard C. Blagrove

February 6, 2026

Integrative Machine Learning for Predicting Running-Related Injuries

Overview

This study developed a machine learning (ML) framework that uses multidisciplinary risk factors to predict running-related injuries (RRIs) in competitive endurance runners. Random forest models achieved the best predictive performance (AUC of about 0.78), a moderate improvement over previous approaches that highlights the value of integrating diverse data types for individualized injury risk prediction.

Background

Running-related injuries (RRIs) are multifactorial and pose significant health and economic burdens for endurance athletes. Traditional prediction models have often focused on a limited set of risk factors, without integrating genetic, biomechanical, nutritional, and training data. Machine learning offers a promising way to handle complex, multidimensional data for personalized injury risk assessment. This study prospectively monitored 142 competitive runners over 12 months, collecting weekly data across multiple domains to develop and evaluate ML models for RRI prediction.
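Because each runner contributes many weekly samples, records from the same athlete are not independent. The summary does not describe how the authors split their data, so the following is only a minimal sketch of one common precaution: assigning all weeks from a runner to the same cross-validation fold. The function name `grouped_folds`, the runner identifiers, and the fold count are hypothetical.

```python
from collections import defaultdict

def grouped_folds(runner_ids, n_folds=5):
    """Assign each weekly sample to a fold so that all weeks
    from one runner land in the same fold (no runner straddles folds)."""
    # Count how many weekly samples each runner contributed.
    counts = defaultdict(int)
    for rid in runner_ids:
        counts[rid] += 1

    # Greedy balancing: place the largest runners first,
    # each into whichever fold currently holds the fewest samples.
    fold_sizes = [0] * n_folds
    fold_of_runner = {}
    for rid, n in sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        target = fold_sizes.index(min(fold_sizes))
        fold_of_runner[rid] = target
        fold_sizes[target] += n

    return [fold_of_runner[rid] for rid in runner_ids]

# Toy data (hypothetical): six runners with differing numbers of weekly records.
runner_ids = ["r1"] * 5 + ["r2"] * 4 + ["r3"] * 3 + ["r4"] * 3 + ["r5"] * 2 + ["r6"] * 1
folds = grouped_folds(runner_ids, n_folds=3)
```

A model evaluated this way is always tested on runners it has not seen during training, which is closer to how an injury-risk tool would be used on a new athlete.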

Data Highlights

Parameter                                    Value
Number of runners                            142
Weekly samples collected                     6181
Monitoring duration                          12 months
Best model AUC (Random Forest)               0.781 ± 0.016 to 0.784 ± 0.014
Significance level for improved performance  q < 0.05
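To make the AUC figures concrete, here is a minimal sketch of how an AUC can be computed from predicted risk scores and then summarized across repeated evaluations. It assumes the ± values are standard deviations over repeats, which the summary does not state. All numbers below are toy values, not data from the study, and the helper names are hypothetical.

```python
from statistics import mean, stdev

def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen injured sample (label 1)
    receives a higher score than a randomly chosen uninjured one (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def summarize(aucs):
    """Format repeated AUC estimates as 'mean ± sample standard deviation'."""
    return f"{mean(aucs):.3f} ± {stdev(aucs):.3f}"

# Toy example: four weekly samples, two of them followed by an injury.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(toy_labels, toy_scores)

# Hypothetical AUCs from three repeated evaluations.
repeat_aucs = [0.76, 0.78, 0.80]
summary = summarize(repeat_aucs)
```

An AUC of 0.5 corresponds to ranking at chance level and 1.0 to perfect ranking, so a value near 0.78 means an injured sample outranks an uninjured one in roughly 78% of pairs.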

Key Findings

  • Integration of multidisciplinary risk factors including genetics, biomechanics, nutrition, and training data enabled improved RRI prediction.
  • Random forest models outperformed the other ML algorithms, with an AUC of around 0.78, indicating moderate discriminative ability.
  • Logistic regression performed significantly better when trained on a broader range of risk factors than when trained on high-quality subsets.
  • The study provides a reproducible ML framework and a valuable dataset for future large-scale injury prediction research.
  • Comparative analysis of ML methods revealed important interactions between data structure and model suitability for RRI prediction.
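The significance threshold is reported as q < 0.05. A q-value usually denotes a p-value adjusted for multiple comparisons, but the summary does not say which correction was applied. As an illustration only, the sketch below uses the Benjamini-Hochberg procedure, a common choice when several models or feature sets are compared. The p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is no larger
    # than the q-values of all larger p-values (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical p-values from four model comparisons.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = bh_qvalues(pvals)
significant = [q < 0.05 for q in qvals]
```

In this toy case three raw p-values fall below 0.05, yet only the first comparison remains significant after adjustment, which shows why a q-value threshold is stricter than the same threshold on unadjusted p-values.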

Clinical Implications

Clinicians and sports scientists can leverage integrative ML models incorporating diverse risk factors to better identify athletes at risk of RRIs. This approach supports personalized injury prevention strategies and targeted interventions. The reproducible framework and dataset facilitate ongoing refinement and validation of predictive tools in endurance running populations.

Conclusion

This study demonstrates that machine learning models integrating multidisciplinary risk factors can moderately improve the prediction of running-related injuries. The findings support the potential of data-driven, individualized injury risk assessment in competitive endurance runners.
