Fair positive unlabeled learning for predicting undiagnosed Alzheimer’s disease in diverse electronic health records - Summary - MDSpire

By Thai Tran, Mingzhou Fu, Jessica Fung, Sriram Sankararaman, David A. Elashoff, Keith Vossel, and Timothy S. Chang

November 27, 2025

Objective:

To improve the prediction of undiagnosed Alzheimer's disease (AD) in electronic health records using semi-supervised positive unlabeled learning (SSPUL), a method that learns from both labeled and unlabeled data, while reducing racial and ethnic bias in predictions for diverse populations.
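
The general idea behind positive unlabeled learning can be illustrated with a small sketch. This is not the authors' SSPUL algorithm, which this summary does not describe; it is the classic Elkan-Noto label-frequency correction, swapped in purely for illustration. It assumes that diagnosed patients are a random sample of all true AD cases, and every score below is made-up example data standing in for the output of a hypothetical classifier trained to separate labeled from unlabeled patients.

```python
def estimate_label_frequency(scores_of_labeled_positives):
    """Estimate c = P(labeled | truly positive) as the mean classifier
    score over held-out labeled positives (Elkan-Noto estimator)."""
    return sum(scores_of_labeled_positives) / len(scores_of_labeled_positives)


def adjust_scores(scores, c):
    """Convert P(labeled | x) into an estimate of P(truly positive | x),
    capped at 1.0 because the ratio can exceed 1 on noisy scores."""
    return [min(1.0, s / c) for s in scores]


# Hypothetical scores for held-out patients with a recorded AD diagnosis.
held_out_labeled_scores = [0.5, 0.25, 0.75, 0.5]
# Hypothetical scores for patients with no recorded diagnosis.
unlabeled_scores = [0.05, 0.4, 0.1, 0.3]

c = estimate_label_frequency(held_out_labeled_scores)
adjusted = adjust_scores(unlabeled_scores, c)

# Flag unlabeled patients whose adjusted probability reaches 0.5
# (the threshold is an illustrative choice, not taken from the study).
predicted_undiagnosed = [p >= 0.5 for p in adjusted]
print(c, adjusted, predicted_undiagnosed)
```

The point of the correction is that a low raw score for an unlabeled patient does not by itself rule out disease, because only a fraction of true cases ever receive a label.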

Key Findings:
  • SSPUL achieved a sensitivity of 0.77–0.81 and an area under the precision-recall curve (AUCPR) of 0.81–0.87, outperforming supervised baseline models (sensitivity: 0.39–0.53; AUCPR: 0.3–0.7).
  • SSPUL was also the fairest of the models compared, with the lowest cumulative parity loss.
  • The analysis identified features shared by, and distinct between, labeled and unlabeled AD patients, including neurological indicators (e.g., memory loss) and non-neurological indicators (e.g., decubitus ulcer).
  • Polygenic risk scores were significantly higher in labeled and predicted positives compared to predicted negatives among non-Hispanic white, Hispanic Latino, and East Asian groups (p < 0.001).
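
The metrics named in these findings can be sketched in a few lines. This is a minimal illustration under stated assumptions: the labels, scores, group names, and threshold are invented, AUCPR is approximated by step-wise average precision, and the parity loss follows one common definition (the summed absolute log-ratio of each group's metric to a reference group's). The study's exact formulation of cumulative parity loss may differ.

```python
import math


def sensitivity(y_true, y_pred):
    """Fraction of true positives that the model flags: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.
    Assumes scores are distinct, so the ranking is unambiguous."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    tp = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            total += tp / rank  # precision at each newly recalled positive
    return total / n_pos


def parity_loss(metric_by_group, reference_group):
    """Sum of |ln(metric_group / metric_reference)| over all groups.
    Zero means every group matches the reference exactly."""
    ref = metric_by_group[reference_group]
    return sum(abs(math.log(v / ref)) for v in metric_by_group.values())


# Made-up example data: 1 = AD, 0 = no AD.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.65 else 0 for s in scores]  # illustrative threshold

sens = sensitivity(y_true, y_pred)
aucpr = average_precision(y_true, scores)

# Hypothetical per-group sensitivities; "group_a" is the reference.
group_sensitivity = {"group_a": 0.8, "group_b": 0.8, "group_c": 0.4}
loss = parity_loss(group_sensitivity, "group_a")
print(round(sens, 3), round(aucpr, 3), round(loss, 3))
```

Under this definition, a cumulative parity loss would add such per-metric losses together, so a lower total indicates that model performance varies less across groups.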

Interpretation:

The findings suggest that SSPUL can substantially improve the prediction of undiagnosed AD while promoting fairness across diverse racial and ethnic groups, potentially leading to improved diagnostic practices.

Limitations:
  • The study may be limited by the quality and completeness of electronic health records, which can vary widely.
  • Potential biases in the underlying data, such as historical disparities in healthcare access, may still affect the results despite mitigation efforts.

Conclusion:

SSPUL represents a promising approach to improve the equitable prediction of undiagnosed Alzheimer's Disease, addressing both sensitivity and fairness in diverse populations.
