An XGBoost-based model for detecting undiagnosed type 2 diabetes using routine physical and lifestyle data from a multi-center Chinese population - Report - MDSpire

By Hui Xiao, Qian Xi, Ping Zeng, Jinjuan Hao, Qinghua He, Xiaoxia Wang, and Chi Zhang

June 24, 2026

Clinical Report: Machine Learning for Identifying Undiagnosed Type 2 Diabetes

Overview

This study developed and validated an XGBoost model to identify undiagnosed type 2 diabetes (T2D) using routine health checkup data from a multi-center cohort in China. On an independent test set, the model showed moderate discrimination, with an area under the receiver operating characteristic curve (AUC) of 77.2%.

Background

Diabetes, particularly type 2 diabetes (T2D), is a growing global health concern, with projections indicating a significant rise in prevalence. Early diagnosis is crucial for improving patient outcomes. This study addresses the gap in utilizing routine health checkup data for real-time risk assessment of undiagnosed T2D.

Data Highlights

Predictor                  Influence (%)
Fasting blood glucose      50.6
Creatinine                 6.6
Triglyceride               5.6
Age                        5.1
Low-density lipoprotein    5.0
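
The listed percentages can be sanity-checked with simple arithmetic. The sketch below assumes, although the report does not state it, that the influence values are normalised to sum to 100% across all 12 predictors. Under that assumption it shows how much influence is left for the seven predictors that are not listed.

```python
# Influence values copied from the Data Highlights table (percent).
top_five = {
    "Fasting blood glucose": 50.6,
    "Creatinine": 6.6,
    "Triglyceride": 5.6,
    "Age": 5.1,
    "Low-density lipoprotein": 5.0,
}

# ASSUMPTION (not stated in the report): influences sum to 100%
# over all 12 predictors in the model.
TOTAL_PERCENT = 100.0
N_PREDICTORS = 12

listed_share = round(sum(top_five.values()), 1)
remaining_share = round(TOTAL_PERCENT - listed_share, 1)
unlisted_count = N_PREDICTORS - len(top_five)

print(f"Top five predictors combined: {listed_share}%")
print(f"Remaining {unlisted_count} predictors combined: {remaining_share}%")
```

If the assumption holds, fasting blood glucose alone accounts for about half of the total influence, and the seven unlisted predictors together account for roughly a quarter.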

Key Findings

  • The XGBoost model was developed using data from 11,382 individuals.
  • The model was validated on an independent test set of 1,026 individuals.
  • The area under the receiver operating characteristic curve (AUC) for the model was 77.2% (95% CI: 70.3%–84.1%).
  • Fasting blood glucose was identified as the most influential predictor of undiagnosed T2D.
  • The model includes 12 predictors related to metabolic and inflammatory markers.
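
To make the AUC figure concrete, the following minimal sketch computes an AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and attaches a normal-approximation confidence interval using the Hanley-McNeil standard error. The labels and scores are toy values invented for illustration, not study data. The report does not say how its interval was computed, so the Hanley-McNeil choice is an assumption; the reported interval is symmetric about the estimate (77.2% ± 6.9 points), which is at least consistent with a normal approximation. The reported interval cannot be reproduced here because the number of T2D cases in the test set is not given.

```python
import math


def auc_mann_whitney(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation CI using the Hanley-McNeil standard error."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    # Clamp to the valid AUC range [0, 1].
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy data (hypothetical): 1 = undiagnosed T2D, 0 = no T2D.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.5]

auc = auc_mann_whitney(labels, scores)
lo, hi = hanley_mcneil_ci(auc, n_pos=3, n_neg=5)
print(f"AUC = {auc:.3f}, 95% CI: {lo:.3f} to {hi:.3f}")
```

With only eight toy cases the interval is very wide. A test set of 1,026 individuals yields a much narrower interval, though its width depends mainly on how many of them are true cases.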

Clinical Implications

The XGBoost model can assist clinicians in identifying individuals at high risk for undiagnosed T2D during routine health examinations.
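
One way such a model could be used at a checkup is to flag individuals whose predicted risk exceeds a cutoff and refer them for confirmatory testing. The sketch below illustrates that idea only. The function names, the toy risk scores, and the 0.5 threshold are all hypothetical, and the report does not specify an operating threshold.

```python
def flag_high_risk(risk_scores, threshold):
    """Return True for each person whose predicted risk meets the threshold."""
    return [score >= threshold for score in risk_scores]


def sensitivity_specificity(labels, flags):
    """Sensitivity and specificity of the flags against true labels (1 = T2D)."""
    tp = sum(1 for y, f in zip(labels, flags) if y == 1 and f)
    fn = sum(1 for y, f in zip(labels, flags) if y == 1 and not f)
    tn = sum(1 for y, f in zip(labels, flags) if y == 0 and not f)
    fp = sum(1 for y, f in zip(labels, flags) if y == 0 and f)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): true status and model-predicted risk.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
risk_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.5]

THRESHOLD = 0.5  # hypothetical cutoff, not taken from the report

flags = flag_high_risk(risk_scores, THRESHOLD)
sensitivity, specificity = sensitivity_specificity(labels, flags)
print(f"Flagged {sum(flags)} of {len(flags)} people")
print(f"Sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Lowering the threshold flags more people and raises sensitivity at the cost of specificity, so the cutoff would in practice be chosen to match the capacity for follow-up testing.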

Conclusion

The study demonstrates the feasibility of using machine learning to enhance the identification of undiagnosed T2D through routine health data.
