Development and Validation of a Machine Learning–Based Screening Algorithm to Predict High-Risk Hepatitis C Infection - Scorecard - MDSpire

By Suk-Chan Jang, Wei-Hsuan Lo-Ciganic, Pilar Hernandez-Con, Chanakan Jenjai, James Huang, Ashley Stultz, Shunhua Yan, Debbie L Wilson, Ashley Norse, Faheem W Guirgis, Robert L Cook, Christine Gage, Khoa A Nguyen, Patrick Hornes, Yonghui Wu, David R Nelson, Haesuk Park

August 15, 2025

Clinical Scorecard: Machine Learning Algorithm for Identifying High-Risk Individuals for Hepatitis C Infection

At a Glance

Category | Detail
Condition | Hepatitis C virus (HCV) infection
Key Mechanisms | Asymptomatic infection leading to undiagnosed cases; transmission linked to the opioid epidemic and injection drug use; machine learning models using EHR data to predict infection risk
Target Population | Adults aged 18–79 years tested for HCV antibodies, RNA, or genotype
Care Setting | Clinical settings utilizing electronic health records for targeted screening

Key Highlights

  • HCV infection rates have increased amid the opioid epidemic, and about one-third of infected individuals are unaware of their infection because it is often asymptomatic.
  • A gradient boosting machine (GBM) model using 275 sociodemographic and clinical features achieved high discrimination (C statistic 0.916) for HCV infection.
  • The highest risk decile flagged by the ML algorithm contained 75.63% of HCV cases, enabling efficient targeted screening with one positive case per six tests.
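
The top-decile figures above combine two quantities: the share of all cases that fall in the highest-risk 10% of patients, and the number of tests needed per positive case within that group. A minimal Python sketch of that arithmetic follows. The function name, the toy scores, and the toy labels are hypothetical and invented for illustration; they are not the study's data or code.

```python
from math import ceil


def top_decile_summary(scores, labels):
    """Summarize how many true cases fall in the highest-risk 10% of patients.

    scores: predicted risk per patient (higher means riskier); hypothetical model output.
    labels: 1 if the patient tested HCV positive, else 0.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = ceil(len(ranked) / 10)  # size of the top decile
    top_positives = sum(label for _, label in ranked[:n_top])
    total_positives = sum(labels)
    return {
        "n_screened": n_top,
        # Share of all positive cases that sit inside the top decile.
        "capture_rate": top_positives / total_positives if total_positives else 0.0,
        # Tests performed in the top decile per positive case found there.
        "tests_per_positive": n_top / top_positives if top_positives else float("inf"),
    }


# Toy data, invented for illustration only: 20 patients, 3 true cases.
scores = [0.95, 0.90, 0.40, 0.35, 0.30] + [0.10] * 15
labels = [1, 0, 1, 0, 0] + [0] * 14 + [1]
summary = top_decile_summary(scores, labels)
```

In this toy example the top decile holds two patients and one of the three true cases, so the capture rate is one-third and two tests are needed per positive case.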

Guideline-Based Recommendations

Diagnosis

  • Use machine learning algorithms applied to EHR data to identify individuals at high risk for HCV infection.
  • Screen adults aged 18 years and older for HCV antibodies, RNA, or genotype per CDC recommendations.

Management

  • Prioritize targeted screening for individuals stratified as high risk by ML models to optimize resource use.
  • Implement direct-acting antiviral therapies for confirmed HCV cases to achieve sustained virologic response.

Monitoring & Follow-up

  • Monitor risk stratification performance and update ML models with new data to maintain predictive accuracy.
  • Follow up on patients identified as high risk to ensure timely diagnostic testing and treatment initiation.
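
One way to monitor risk stratification performance is to recompute the C statistic on newly tested patients and compare it with the value reported at validation (0.916). The Python sketch below computes the C statistic by direct pairwise comparison and flags a drop. The `needs_review` helper and its 0.02 tolerance are assumptions made for illustration, not thresholds from the source.

```python
def c_statistic(scores, labels):
    """Probability that a random positive case is scored above a random
    negative case, with ties counted as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def needs_review(current_c, baseline_c, tolerance=0.02):
    """Flag the model when discrimination drops by more than `tolerance`.

    The tolerance is a hypothetical value chosen for this sketch.
    """
    return (baseline_c - current_c) > tolerance


# Toy monitoring batch, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 1, 0]
current = c_statistic(scores, labels)
flagged = needs_review(current, baseline_c=0.916)
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; a rank-based implementation would be a better fit for large cohorts.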

Risks

  • Undiagnosed HCV infection contributes to ongoing transmission and liver-related morbidity and mortality.
  • Without targeted approaches, universal screening may lead to unnecessary testing and resource burden.

Patient & Prescribing Data

Adults tested for HCV infection in diverse clinical settings across Florida

ML-based risk stratification can enhance identification of patients needing confirmatory testing and antiviral treatment, improving care efficiency and outcomes.

Clinical Best Practices

  • Incorporate ML algorithms into clinical workflows to identify high-risk patients without disrupting care.
  • Use a 6-month window of sociodemographic and clinical data prior to testing to inform risk prediction.
  • Validate and compare multiple ML models to select the most accurate tool for HCV risk prediction.
  • Address clinician barriers to discussing sensitive risk factors by leveraging data-driven screening tools.
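
The 6-month lookback in the list above can be sketched as a date filter applied to each patient's records before features are built. The Python example below is an illustration under stated assumptions: the 183-day approximation of six months, the event record layout, the event codes, and the count-based feature encoding are all hypothetical and do not come from the source.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 183  # assumption: "6 months" approximated as 183 days


def events_in_lookback(events, test_date, lookback_days=LOOKBACK_DAYS):
    """Keep events dated inside the lookback window and strictly before the test date."""
    window_start = test_date - timedelta(days=lookback_days)
    return [e for e in events if window_start <= e["date"] < test_date]


def count_features(events):
    """Encode retained events as per-code counts (one possible feature encoding)."""
    counts = {}
    for e in events:
        counts[e["code"]] = counts.get(e["code"], 0) + 1
    return counts


# Toy records for a single patient, invented for illustration only.
events = [
    {"code": "opioid_rx", "date": date(2024, 1, 10)},  # older than the window
    {"code": "opioid_rx", "date": date(2024, 9, 1)},
    {"code": "ed_visit", "date": date(2024, 11, 20)},
    {"code": "ed_visit", "date": date(2024, 12, 1)},   # on the test date, excluded
]
test_date = date(2024, 12, 1)
features = count_features(events_in_lookback(events, test_date))
```

Excluding events on or after the test date keeps information from the test itself out of the predictors, which matters when the same pipeline is later used to compare candidate models.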
