Objective:
To develop a deep learning framework for automated grading of knee osteoarthritis (KOA) using the Kellgren–Lawrence (KL) system, specifically addressing limitations such as reliance on single-scale features and poor generalization to external datasets.
Approach:
Key Findings:
On internal validation, the model achieved an F1 score of 0.726, precision of 0.740, Matthews correlation coefficient (MCC) of 0.620, and accuracy of 0.726.
On external validation, the model maintained robust generalization (F1: 0.656, precision: 0.683, MCC: 0.564, accuracy: 0.685), with accuracy dropping by only 4.04 percentage points relative to internal validation, suggesting clinical reliability.
Misclassifications were mainly confined to adjacent KL grades, with no extreme errors observed, indicating that the model's errors stay close to the reference grade.
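The metrics reported above can all be derived from a single 5×5 confusion matrix over KL grades 0 to 4. The following is a minimal sketch, not the study's evaluation code: the function names are hypothetical, the labels are toy data, and macro averaging is an assumption because the summary does not state which averaging scheme was used for F1 and precision. MCC uses the standard multiclass generalization of the Matthews correlation coefficient.

```python
from math import sqrt

N_GRADES = 5  # KL grades 0-4


def confusion_matrix(y_true, y_pred, n=N_GRADES):
    """cm[t][p] counts images with reference grade t predicted as grade p."""
    cm = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def grading_metrics(cm):
    """Accuracy, macro precision/F1 (assumed averaging), multiclass MCC,
    and how far misclassifications fall from the reference grade."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]                       # row sums
    pred_counts = [sum(cm[t][k] for t in range(n)) for k in range(n)]  # column sums

    precisions, f1s = [], []
    for k in range(n):
        prec = cm[k][k] / pred_counts[k] if pred_counts[k] else 0.0
        rec = cm[k][k] / true_counts[k] if true_counts[k] else 0.0
        precisions.append(prec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)

    # Multiclass MCC computed from marginal counts of the confusion matrix.
    num = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    den = sqrt((total ** 2 - sum(p * p for p in pred_counts))
               * (total ** 2 - sum(t * t for t in true_counts)))
    mcc = num / den if den else 0.0

    # Error distance: |reference grade - predicted grade| for each mistake.
    errors = total - correct
    adjacent = sum(cm[t][p] for t in range(n) for p in range(n) if abs(t - p) == 1)
    max_dist = max((abs(t - p) for t in range(n) for p in range(n)
                    if t != p and cm[t][p]), default=0)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "f1": sum(f1s) / n,
        "mcc": mcc,
        "adjacent_error_share": adjacent / errors if errors else 0.0,
        "max_error_distance": max_dist,
    }


# Toy labels for illustration only; these are not data from the study.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 1, 1, 1, 2, 3, 3, 3, 4, 3]
metrics = grading_metrics(confusion_matrix(y_true, y_pred))
print(metrics)
```

An `adjacent_error_share` near 1 together with a `max_error_distance` of 1 corresponds to the pattern described above, in which errors stay within one KL grade of the reference.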
Interpretation:
The multi-scale attention-guided framework enables reliable, automated KL grading of KOA from radiographs, addressing key limitations of prior models.
Limitations:
The study may be limited by the quality and diversity of training data, which could affect the model's performance in real-world applications.
Performance may vary based on external dataset characteristics, potentially impacting generalizability.
Conclusion:
The proposed framework supports standardized, objective, and interpretable clinical assessment of KOA severity.