Objective:
To develop an interpretable machine learning model that refines risk stratification for 5-year breast cancer recurrence.
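Approach:
In a retrospective, single-institution cohort, an XGBoost model for 5-year recurrence was compared with logistic regression and the TNM system in an internal validation cohort, and SHAP analysis was used to interpret its predictions.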
Key Findings:
The XGBoost model achieved an AUC of 0.877 in the internal validation cohort, outperforming logistic regression (AUC = 0.693) and the TNM system.
SHAP analysis identified the Ki-67 index and positive lymph nodes as the most influential predictors.
The model effectively stratified patients within TNM Stages II and III into distinct high- and low-risk groups.
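The within-stage stratification described above can be sketched as follows. This is a minimal illustration, not the study's procedure: the 0.5 risk cutoff, the field names (`stage`, `risk`, `recurred`), and the toy records are all assumptions, since the source does not state how the high- and low-risk groups were defined.

```python
from collections import defaultdict

def stratify_within_stage(patients, threshold=0.5):
    """Group patients by TNM stage, then split each stage by predicted risk.

    `threshold` is a hypothetical cutoff; the source does not report one.
    """
    groups = defaultdict(lambda: {"low": [], "high": []})
    for p in patients:
        label = "high" if p["risk"] >= threshold else "low"
        groups[p["stage"]][label].append(p)
    return groups

def recurrence_rate(group):
    """Observed fraction of patients with recurrence; None for an empty group."""
    if not group:
        return None
    return sum(p["recurred"] for p in group) / len(group)

# Hypothetical toy records, not data from the study.
patients = [
    {"stage": "II", "risk": 0.12, "recurred": 0},
    {"stage": "II", "risk": 0.18, "recurred": 0},
    {"stage": "II", "risk": 0.71, "recurred": 1},
    {"stage": "II", "risk": 0.64, "recurred": 0},
    {"stage": "III", "risk": 0.30, "recurred": 0},
    {"stage": "III", "risk": 0.82, "recurred": 1},
    {"stage": "III", "risk": 0.77, "recurred": 1},
    {"stage": "III", "risk": 0.41, "recurred": 1},
]

groups = stratify_within_stage(patients, threshold=0.5)
for stage in sorted(groups):
    for label in ("low", "high"):
        members = groups[stage][label]
        print(stage, label, len(members), recurrence_rate(members))
```

If the model adds information beyond anatomical stage, the observed recurrence rates of the low- and high-risk groups should separate within each stage; in practice this would be assessed with time-to-event methods rather than raw proportions.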
Interpretation:
In this cohort, the XGBoost-based framework predicted 5-year recurrence more accurately than standard anatomical staging, and SHAP analysis keeps its predictions interpretable at the level of individual predictors.
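The SHAP ranking reported in the key findings rests on Shapley values, which split the gap between one patient's prediction and a baseline prediction across the input features. The sketch below computes them exactly by enumerating feature coalitions. It is not the tree-specific algorithm that SHAP tooling uses for models such as XGBoost, and the risk function `toy_risk`, its coefficients, and the feature values are hypothetical stand-ins chosen only to make the arithmetic visible.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance `x` against a `baseline` instance.

    Features outside a coalition take their baseline value. The cost grows
    as 2^n, so this is only feasible for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

def toy_risk(features):
    """Hypothetical risk score; NOT the fitted model from the study."""
    ki67, positive_nodes, age = features
    return 0.01 * ki67 + 0.05 * positive_nodes + 0.002 * ki67 * positive_nodes

patient = [40, 3, 55]    # Ki-67 (%), positive lymph nodes, age (illustrative)
reference = [10, 0, 55]  # hypothetical baseline patient

phi = shapley_values(toy_risk, patient, reference)
for name, value in zip(["Ki-67", "positive nodes", "age"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, and a feature the model ignores (age in this toy function) receives zero, which is the property that makes a SHAP-style ranking of predictors meaningful.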
Limitations:
The study is retrospective and conducted at a single institution, which may limit generalizability.
The model's performance needs to be validated in external cohorts.
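One way to report discrimination on an external cohort is the AUC with a bootstrap confidence interval, sketched below under stated assumptions. The labels and scores are made-up placeholders, not study data, and the percentile bootstrap with 1000 resamples is an illustrative choice rather than the method used in the source.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case outranks a random
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Placeholder external-cohort data (1 = recurrence within 5 years).
labels = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65, 0.50, 0.90, 0.30, 0.45]

point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

A wide interval on a small external cohort would indicate that the internal AUC of 0.877 has not yet been confirmed or refuted, which is why cohort size matters as much as the point estimate.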
Conclusion:
The proposed model enhances risk stratification for breast cancer recurrence, facilitating personalized clinical decision-making.