Clinical Report: Limitations of AI in Clinical Forecasting
Background
The integration of AI-driven predictive tools in electronic health records (EHRs) is becoming increasingly common in clinical practice. However, the accuracy and reliability of these tools remain uncertain. Understanding the limitations of these models is crucial for clinicians who rely on them for decision-making.
Data Highlights
Model                          Vendor AUROC    Pooled AUROC
Sepsis Model                   0.77            0.62
End-of-Life Care Index         0.89            0.76
Patient No-Show Model          0.77            0.62
Unplanned Readmission Model    0.74            0.70
Deterioration Index            0.80            0.79
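As a quick check on the table above, the short sketch below computes the gap between the vendor-reported and pooled AUROC for each model. The values are copied from the table; the function name `auroc_gaps` is illustrative and not part of any vendor tool.

```python
# AUROC values copied from the table above: (vendor-reported, pooled).
AUROC_TABLE = {
    "Sepsis Model": (0.77, 0.62),
    "End-of-Life Care Index": (0.89, 0.76),
    "Patient No-Show Model": (0.77, 0.62),
    "Unplanned Readmission Model": (0.74, 0.70),
    "Deterioration Index": (0.80, 0.79),
}


def auroc_gaps(table):
    """Return vendor minus pooled AUROC per model, rounded to 2 decimals."""
    return {
        name: round(vendor - pooled, 2)
        for name, (vendor, pooled) in table.items()
    }


gaps = auroc_gaps(AUROC_TABLE)

# List models from the largest gap to the smallest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:.2f}")
```

A positive gap means the pooled estimate falls below the vendor figure. A point-estimate gap alone says nothing about statistical uncertainty, which is why the confidence intervals discussed under Key Findings matter.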
Key Findings
Pooled AUROC estimates were lower than Epic's reported benchmarks for all five models, with gaps ranging from 0.01 (Deterioration Index) to 0.15 (Sepsis Model and Patient No-Show Model).
For sepsis, readmission, and end-of-life models, the 95% confidence intervals around pooled estimates did not overlap with Epic's benchmarks.
Every model exhibited high heterogeneity, indicating performance variability across healthcare settings.
Data leakage and model drift are significant factors contributing to the degradation of model performance post-deployment.
Clinicians face ethical uncertainties regarding the reliance on AI outputs for patient care decisions.
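To make the terms "pooled estimate", "95% confidence interval", and "heterogeneity" in the findings above concrete, here is a minimal sketch of one common approach: DerSimonian-Laird random-effects pooling with the I² statistic. The report does not state which pooling method was used, so this is an assumption. The site-level AUROCs and standard errors are hypothetical, and the sketch pools on the raw AUROC scale for simplicity (a logit transform is often preferred in practice).

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled estimate, 95% CI lower, 95% CI upper, I^2 in percent).
    """
    # Inverse-variance (fixed-effect) weights and estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / total_w

    # Cochran's Q measures dispersion of site estimates around the fixed estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1

    # Between-site variance (tau^2), truncated at zero.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each site's variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))

    # I^2: share of total variation attributed to between-site heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, i2


# Hypothetical site-level AUROCs and standard errors (not from the report).
site_aurocs = [0.58, 0.70, 0.61, 0.55, 0.68]
site_ses = [0.02, 0.03, 0.02, 0.03, 0.02]

pooled, low, high, i2 = pool_random_effects(site_aurocs, site_ses)

# Vendor-reported sepsis AUROC from the table, used here only for illustration.
benchmark = 0.77
benchmark_outside_ci = not (low <= benchmark <= high)

print(f"pooled={pooled:.3f}, 95% CI=({low:.3f}, {high:.3f}), I2={i2:.0f}%")
print(f"benchmark outside CI: {benchmark_outside_ci}")
```

When the benchmark falls outside the pooled interval, the discrepancy is unlikely to be explained by sampling error alone. A high I² suggests that performance differs meaningfully from site to site rather than scattering around a single true value.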
Clinical Implications
Clinicians should be aware of the discrepancies between model performance in development and real-world application. Continuous evaluation and validation of these tools are necessary.
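One way to picture continuous evaluation is to recompute AUROC on each new batch of locally observed outcomes and flag periods that fall below an agreed floor. The sketch below is illustrative only: the period names, labels, scores, and the 0.80 floor are hypothetical, and `auroc` and `flag_low_periods` are not part of any EHR vendor API.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def flag_low_periods(periods, floor):
    """Return the names of periods whose AUROC falls below `floor`."""
    return [
        name
        for name, (labels, scores) in periods.items()
        if auroc(labels, scores) < floor
    ]


# Hypothetical outcomes (1 = event occurred) and model risk scores per period.
periods = {
    "period_1": ([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.3, 0.2, 0.7, 0.4]),
    "period_2": ([1, 1, 0, 0, 1, 0], [0.6, 0.3, 0.5, 0.2, 0.7, 0.4]),
}

flagged = flag_low_periods(periods, floor=0.80)
print(flagged)
```

The pairwise comparison is quadratic in the number of cases, which is fine for a sketch but too slow for large cohorts; a rank-based implementation from an established statistics library would be the usual choice in production.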
Conclusion
The findings reveal gaps in the predictive capabilities of AI tools in clinical practice, highlighting the need for further investigation into their effectiveness and the factors influencing model performance.