To develop and evaluate a large language model-based framework for early identification of OSAHS-related risk factors and early signals from free-text descriptions, enhancing timely screening and intervention.
Key Findings:
OSAHSrisk-LLM achieved an overall accuracy of 92.9% in the four-class text-level classification task, significantly outperforming baseline models including CNN, Text-CNN, Transformer, and BERT.
Demonstrated strong robustness under highly imbalanced class distributions and improved identification of minority categories, with a focus on precision and recall metrics.
Interpretation:
Large language models, when integrated with clinical knowledge constraints and structured reasoning strategies, can effectively extract early OSAHS-related risk information from patient narratives. This integration enhances the reliability of the extracted information.
Limitations:
Further validation against clinically confirmed OSAHS diagnoses is required to ensure the framework's accuracy in real-world settings.
Prospective screening cohorts need to be assessed before real-world clinical application to evaluate the framework's effectiveness.
Conclusion:
The proposed framework demonstrates the potential of large language models to identify textual mentions of OSAHS-related risk factors and early signals in patient-generated narratives.