Enhancing clinically cardiovascular machine learning model for risk prediction via sample augmentation

By
Xiaoyu Tang
Min Tang
Wu Liu
Shaoyang Cui
June 9, 2026

Frontiers In Medicine

Objective:

To evaluate the value of moderate data augmentation for cardiovascular risk modeling and to propose an interpretable, deployable solution within a clear framework that moves from continuous risk assessment to thresholded classification.

Key Findings:

2× augmentation achieved a favorable compromise between error reduction (lower MAE and RMSE) and goodness of fit (higher R²).
The Random Forest (RF) model achieved an accuracy of 94.0%, F2 of 94.4%, sensitivity of 95.9%, and specificity of 91.8% after thresholding.
Key driving factors identified include oldpeak, num major vessels, chest pain type, thal, exang, and max hr.
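
The metrics named above can be illustrated with a minimal sketch. It assumes the model outputs a continuous risk score in [0, 1] that is compared with a cutoff to produce a binary label; the function names, the 0.5 cutoff, and the toy data are illustrative assumptions, not values or code from the study.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def regression_metrics(y_true, y_pred):
    """MAE, RMSE and R^2 for continuous risk predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - _safe_div(ss_res, ss_tot),
    }


def threshold_metrics(y_true, risk, threshold=0.5, beta=2.0):
    """Binarize risk scores at `threshold` (hypothetical cutoff) and score them."""
    y_hat = [1 if r >= threshold else 0 for r in risk]
    tp = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_hat) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_hat) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 0)

    sensitivity = _safe_div(tp, tp + fn)   # recall on diseased cases
    specificity = _safe_div(tn, tn + fp)   # recall on healthy cases
    precision = _safe_div(tp, tp + fp)
    b2 = beta * beta
    # F-beta with beta=2 weights sensitivity more heavily than precision.
    f_beta = _safe_div((1 + b2) * precision * sensitivity,
                       b2 * precision + sensitivity)
    return {
        "accuracy": _safe_div(tp + tn, len(y_true)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_beta": f_beta,
    }


if __name__ == "__main__":
    # Toy labels and risk scores, invented for illustration only.
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.2, 0.6, 0.1, 0.7, 0.3]
    print(regression_metrics(labels, scores))
    print(threshold_metrics(labels, scores))
```

With beta set to 2, the F-score penalizes missed cases more than false alarms, which suits screening settings where sensitivity matters most.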

Interpretation:

Moderate data augmentation (preferably 2×) can significantly enhance robustness in small sample settings; RF strikes a favorable balance between accuracy, stability, and interpretability.
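
This summary does not say which augmentation technique the study used. As one possible illustration, the sketch below doubles a small numeric training set by appending a copy of each row perturbed with Gaussian noise. The jitter approach, the `noise_scale` value, the reading of "2×" as "original plus one synthetic copy", and the function name are all assumptions made for this example.

```python
import random


def augment_with_jitter(rows, labels, factor=2, noise_scale=0.05, seed=0):
    """Return a dataset `factor` times the original size.

    Assumption: augmentation is Gaussian jitter on numeric features, scaled
    by each feature's standard deviation. The study's method may differ.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    # Per-feature standard deviation, used to scale the noise.
    n_features = len(rows[0])
    stds = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        mean = sum(column) / len(column)
        variance = sum((v - mean) ** 2 for v in column) / len(column)
        stds.append(variance ** 0.5)

    # Start from the untouched originals, then append jittered copies.
    aug_rows = [list(row) for row in rows]
    aug_labels = list(labels)
    for _ in range(factor - 1):
        for row, label in zip(rows, labels):
            noisy = [v + rng.gauss(0.0, noise_scale * s)
                     for v, s in zip(row, stds)]
            aug_rows.append(noisy)
            aug_labels.append(label)  # the label is carried over unchanged
    return aug_rows, aug_labels


if __name__ == "__main__":
    # Toy rows of [oldpeak, max hr]; the values are invented for illustration.
    X = [[1.2, 150.0], [0.0, 172.0], [2.6, 131.0], [0.4, 165.0]]
    y = [1, 0, 1, 0]
    X_aug, y_aug = augment_with_jitter(X, y, factor=2)
    print(len(X), "->", len(X_aug))
```

Any augmentation of this kind should be applied to the training split only, so that synthetic copies of a patient's record never appear in the evaluation data.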

Limitations:

The study is limited to a specific heart disease classification dataset, which may not represent other conditions.
Results may not be generalizable to other datasets or clinical scenarios, limiting broader applicability.

Conclusion:

This study offers reusable guidance on choosing the augmentation multiple, together with a risk stratification scheme, providing a methodological foundation for deploying interpretable cardiovascular risk models effectively.
