Development and validation of a machine learning model for sperm DNA fragmentation rate in infertile men: a multicenter retrospective study - Report - MDSpire

By Ke Wang, Jinxia Zheng, Xuanxuan Ge, Jie Bai, Mengmeng Ma, Ningxin Qin, Xin Huang, Hui Jiang, and You Zhang

June 22, 2026

Clinical Report: Machine Learning Model for Predicting Sperm DNA Fragmentation

Overview

This multicenter retrospective study developed a machine learning model that predicts the sperm DNA fragmentation index (DFI) in infertile men from clinical and semen parameters. The Random Forest model achieved the highest discrimination (AUC 0.979 in the development cohort and 0.945 in external validation) but was miscalibrated in external validation.

Background

Infertility is a significant global health issue, with male factors contributing to a substantial proportion of cases. The sperm DNA fragmentation index (DFI) is increasingly recognized as a critical measure of male fertility potential, particularly in the context of assisted reproductive technology (ART). Accurate assessment of sperm quality, including DFI, is essential for improving reproductive outcomes in infertile men.

Data Highlights

Model            Development Cohort AUC    External Validation AUC
Random Forest    0.979 (0.972–0.986)       0.945 (95% CI: 0.916–0.975)

Key Findings

  • The study included 1,037 patients in the development cohort and 290 in the external validation cohort.
  • The Random Forest model outperformed other machine learning models in predicting DFI.
  • The core predictors of DFI in the model included sperm motility, concentration, viability, lifestyle factors, and stress levels.
  • The model exhibited miscalibration in external validation, indicating systematic overestimation of risk.
  • An online prediction platform was developed for practical use based on the model.
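
The combination of a high AUC and miscalibration reported above can look contradictory, so a minimal sketch may help. AUC depends only on how patients are ranked, while calibration depends on whether the predicted probabilities match observed event rates. The data, the function names, and the square-root inflation below are all hypothetical and chosen for illustration. They are not the study's data or its calibration method, which the report does not describe.

```python
def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate.

    A positive value means risk is overestimated on average.
    """
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


# Hypothetical toy cohort: 1 = elevated DFI, 0 = not elevated.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

# Hypothetical predictions whose mean matches the observed rate (3 of 10).
well_calibrated = [0.05, 0.10, 0.10, 0.15, 0.20, 0.30, 0.40, 0.45, 0.60, 0.65]

# A strictly increasing transform that inflates every probability.
# Patient ranking is preserved, so AUC cannot change.
inflated = [p ** 0.5 for p in well_calibrated]

for name, probs in [("well calibrated", well_calibrated), ("inflated", inflated)]:
    print(
        f"{name:>15}: AUC = {auc(y_true, probs):.3f}, "
        f"mean predicted - observed = {calibration_in_the_large(y_true, probs):+.3f}"
    )
```

Both sets of predictions yield the same AUC, but only the inflated set shows a positive gap between mean predicted risk and the observed rate. This is the pattern that "systematic overestimation of risk" describes.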

Clinical Implications

Because the model systematically overestimated risk in the external validation cohort, further validation is necessary before clinical implementation.

Conclusion

The Random Forest model discriminated well in both cohorts (AUC 0.979 and 0.945), but it requires additional validation before use.
