To develop and evaluate a weakly supervised transformer model (WEST) for rare disease diagnosis and subphenotyping using electronic health records (EHRs).
Key Findings:
WEST outperforms existing methods in phenotype classification.
The model effectively identifies clinically relevant subphenotypes.
WEST predicts disease progression more accurately than previous approaches.
Interpretation:
By reducing reliance on manual annotation, WEST enables efficient label representation learning that supports accurate rare disease diagnosis and provides deeper clinical insights from routine EHR data.
Limitations:
The study uses data from Boston Children's Hospital only, which may limit generalizability to other institutions.
Access to the EHR data is restricted due to privacy regulations.
Conclusion:
The WEST model shows significant potential to improve the diagnosis and understanding of rare diseases through enhanced analysis of EHR data.