Objective:
To improve early prediction of rheumatoid arthritis (RA) using routine laboratory parameters and machine learning, with anti-CCP positivity and early RA onset as the prediction targets.
Approach:
Study Design: A total of 500 patients were enrolled, and 29 routine laboratory features were collected to predict two binary targets: anti-CCP positivity and early RA onset.
Key Findings:
Logistic regression achieved the best overall performance, with an accuracy of 0.848, an AUC of 0.857, an F1 score of 0.910, and a Matthews correlation coefficient (MCC) of 0.441.
The Transformer deep learning model reached an AUC of 0.812, below the 0.857 obtained by logistic regression.
ESR and CRP were identified as the most important positive predictive drivers, while albumin was a key protective factor.
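The four metrics reported above can be computed from labels and predicted scores as in the minimal sketch below. The labels and scores are invented toy values, not data from the study, and the 0.5 decision threshold is an assumption. The toy set is deliberately imbalanced (8 positives, 2 negatives) to show how F1 can stay high while MCC is much lower; whether that explains the gap between the reported F1 of 0.910 and MCC of 0.441 is not stated in the source.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example (hypothetical values, not study data).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.90, 0.80, 0.85, 0.70, 0.95, 0.60, 0.75, 0.40, 0.65, 0.20]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

counts = confusion_counts(y_true, y_pred)
print(f"accuracy = {accuracy(*counts):.3f}")      # accuracy = 0.800
print(f"F1       = {f1_score(*counts):.3f}")      # F1       = 0.875
print(f"MCC      = {mcc(*counts):.3f}")           # MCC      = 0.375
print(f"AUC      = {roc_auc(y_true, scores):.3f}")  # AUC      = 0.875
```

MCC uses all four cells of the confusion matrix, so it penalizes poor performance on the minority class, which F1 and accuracy can hide.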
Interpretation:
The machine learning pipeline predicts early RA risk, and the SHAP analysis provides an interpretable rationale for each prediction.
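To illustrate how SHAP attributes a prediction to individual laboratory features, the sketch below uses the closed form for a linear model under a feature-independence assumption: each feature's contribution to the log-odds is its coefficient times the deviation of its value from the background mean. The source does not say which SHAP explainer was used, and every number here (coefficients, intercept, background means, patient values) is hypothetical and chosen only to mirror the reported direction of effect for ESR, CRP, and albumin.

```python
# Hypothetical logistic regression coefficients (log-odds per unit); not from the study.
WEIGHTS = {"ESR": 0.04, "CRP": 0.05, "albumin": -0.10}
INTERCEPT = -1.5  # hypothetical

# Hypothetical background (cohort mean) values: ESR in mm/h, CRP in mg/L, albumin in g/L.
BACKGROUND_MEAN = {"ESR": 20.0, "CRP": 8.0, "albumin": 42.0}

# Hypothetical patient with raised inflammatory markers and above-average albumin.
PATIENT = {"ESR": 45.0, "CRP": 20.0, "albumin": 46.0}


def log_odds(x):
    """Linear predictor of the logistic regression model."""
    return INTERCEPT + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)


def linear_shap(weights, x, background_mean):
    """SHAP values of a linear model, assuming independent features:
    phi_i = w_i * (x_i - mean_i), expressed in log-odds units."""
    return {name: weights[name] * (x[name] - background_mean[name]) for name in weights}


contributions = linear_shap(WEIGHTS, PATIENT, BACKGROUND_MEAN)

# Rank features by absolute contribution, largest first.
for name, phi in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:8s} {phi:+.2f}")
# ESR      +1.00
# CRP      +0.60
# albumin  -0.40

# Additivity check: contributions sum to the shift from the baseline prediction.
shift = log_odds(PATIENT) - log_odds(BACKGROUND_MEAN)
assert abs(sum(contributions.values()) - shift) < 1e-9
```

In this toy case, ESR and CRP push the predicted log-odds up, while above-average albumin pulls it down, which is the sense in which a feature acts as a protective factor.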
Limitations:
Existing studies often evaluate only one or a few models and lack systematic cross-comparison.
The 'black-box' nature of deep learning models limits clinical credibility.
RA datasets commonly exhibit class imbalance, which can affect model performance.
Conclusion:
The study demonstrates the potential of machine learning and SHAP analysis in enhancing early RA risk prediction using routine laboratory data.