Forecasting Anti-CCP Positivity and Early Onset of Rheumatoid Arthritis
Overview
This study evaluates machine learning models that use routine laboratory data to forecast anti-CCP positivity and early rheumatoid arthritis (RA) onset. Logistic regression achieved an accuracy of 84.8% and an AUC of 0.857.
Background
Rheumatoid arthritis (RA) is a prevalent autoimmune disease that can lead to significant joint damage and disability if not diagnosed early. Traditional diagnostic methods often rely on single autoantibody markers, which may overlook many patients. This study explores the potential of machine learning to integrate multiple laboratory parameters for improved early detection of RA.
Data Highlights
Model                 Accuracy   AUC     F1 Score   MCC
Logistic Regression   0.848      0.857   0.910      0.441
Transformer           N/A        0.812   N/A        N/A
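For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, F1, MCC, and AUC are conventionally computed for a binary classifier. It is not the study's evaluation code. The function names and the confusion-matrix counts are hypothetical, chosen only to show that a high F1 score can coexist with a much lower MCC when one class dominates, because F1 ignores true negatives while MCC uses all four cells.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1, and MCC from the four cells of a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses every cell, so it penalizes poor performance on the minority class.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical, imbalanced counts (NOT taken from the study).
example = binary_metrics(tp=80, fp=12, fn=4, tn=4)
print(example)
```

With these illustrative counts, F1 is driven by the many true positives, whereas MCC drops because most negatives are misclassified.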
Key Findings
Logistic regression achieved the best overall performance for predicting early RA, with a higher AUC than the Transformer (0.857 versus 0.812).
Anti-CCP positivity and early RA onset were defined as dual binary prediction targets.
ESR and CRP were identified as key positive predictive drivers, while albumin served as a protective factor.
The SHAP analysis provided mechanistic interpretability, revealing interactions between laboratory parameters.
Dependence plots indicated a non-linear protective threshold effect of albumin.
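To make the attribution idea concrete, here is a minimal sketch of per-feature contributions for a logistic regression. It assumes features are treated as independent, in which case the SHAP value of a feature in log-odds space reduces to its coefficient times the feature's deviation from the cohort mean. The coefficients, intercept, cohort means, and patient values are all hypothetical and are not the study's fitted model. Only the signs follow the findings above: positive for ESR and CRP, negative for albumin. A purely linear attribution like this cannot reproduce a non-linear threshold effect, so it illustrates the direction of each driver only.

```python
import math

# Hypothetical coefficients (log-odds per unit) and cohort means; NOT from the study.
COEFS = {"ESR": 0.03, "CRP": 0.05, "albumin": -0.10}
INTERCEPT = -1.0
COHORT_MEANS = {"ESR": 20.0, "CRP": 5.0, "albumin": 40.0}


def log_odds(x):
    """Linear predictor of the logistic regression for one patient."""
    return INTERCEPT + sum(COEFS[k] * x[k] for k in COEFS)


def probability(x):
    """Predicted probability obtained by applying the sigmoid to the log-odds."""
    return 1.0 / (1.0 + math.exp(-log_odds(x)))


def linear_attributions(x):
    """Per-feature contribution in log-odds: coefficient * (value - cohort mean)."""
    return {k: COEFS[k] * (x[k] - COHORT_MEANS[k]) for k in COEFS}


# Hypothetical patient with raised ESR and CRP and below-average albumin.
patient = {"ESR": 45.0, "CRP": 15.0, "albumin": 34.0}
contrib = linear_attributions(patient)
print(contrib, probability(patient))
```

The contributions sum to the difference between the patient's log-odds and the log-odds at the cohort mean, which is the additivity property that SHAP values are designed to satisfy. In this sketch, below-average albumin yields a positive contribution, the mirror image of albumin acting as a protective factor.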
Clinical Implications
The findings indicate that integrating routine laboratory parameters with machine learning can enhance early RA risk prediction.
Conclusion
The study demonstrates the feasibility of using machine learning to predict early RA onset and anti-CCP positivity.