Clinical Scorecard: A Psychometric-Informed Benchmark for Data Augmentation in Imbalanced Mental Health Surveys Among Students: Moving Beyond Classification Metrics
At a Glance
Category: Detail
Condition: Mental health disorders in students
Key Mechanisms: Data augmentation strategies for improving classification utility and construct validity in mental health surveys.
Target Population: University students experiencing mental health issues.
Care Setting: Academic institutions and mental health screening environments.
Key Highlights
Benchmarking of eight data-augmentation strategies on classification utility and construct validity.
PCT-GAN showed the best performance for small, imbalanced datasets.
SMOTE variants performed well on larger datasets when paired with neural-network classifiers.
PCT-GAN deviated significantly less from the real data's inter-item correlations than the other methods.
Embedding construct-validity constraints enhances the reliability of synthetic mental health data.
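One way to picture "inter-item correlation deviation" is as the average gap between the item-by-item correlation matrix of the real survey and that of the synthetic one. The sketch below is a minimal illustration under that assumption; the benchmark's exact definition may differ, and the function names and toy responses are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def inter_item_correlations(rows):
    """Return {(i, j): r} for every item pair i < j; each row is one respondent."""
    items = list(zip(*rows))  # transpose: one tuple of responses per survey item
    return {
        (i, j): pearson(items[i], items[j])
        for i in range(len(items))
        for j in range(i + 1, len(items))
    }

def correlation_deviation(real_rows, synthetic_rows):
    """Mean absolute difference between real and synthetic item correlations."""
    real = inter_item_correlations(real_rows)
    synth = inter_item_correlations(synthetic_rows)
    return sum(abs(real[k] - synth[k]) for k in real) / len(real)

# Toy 3-item responses (hypothetical values, not taken from the benchmark).
real = [[0, 1, 0], [1, 1, 2], [2, 3, 1], [3, 2, 3]]
faithful = [[0, 1, 0], [1, 1, 2], [2, 3, 1], [3, 2, 3]]   # same item structure
distorted = [[0, 3, 0], [1, 2, 2], [2, 1, 1], [3, 1, 3]]  # item 2 reversed in trend

dev_faithful = correlation_deviation(real, faithful)
dev_distorted = correlation_deviation(real, distorted)
```

A lower deviation means the synthetic data preserves more of the relationships between items that the questionnaire's scales rely on. The sketch assumes every item varies across respondents; a constant item would make the correlation undefined.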
Guideline-Based Recommendations
Diagnosis
Use machine-learning-based screening tools to triage students' mental health needs.
Management
Select SMOTE-family oversamplers when augmentation is transient, that is, used only to rebalance the training set.
Use PCT-GAN for synthetic data intended for sharing and re-analysis.
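To make the SMOTE-family option concrete, the sketch below shows the core interpolation idea behind SMOTE: a synthetic minority row is placed at a random point on the line segment between a minority row and one of its nearest minority neighbours. It is a simplified illustration, not the implementation evaluated in the benchmark; the function name, parameters, and toy data are hypothetical.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to the base row.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority-class rows with two numeric features (hypothetical values).
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]]
new_rows = smote_like_oversample(minority, n_new=5)
```

Interpolated values are generally fractional, so ordinal survey items would need rounding or a variant designed for categorical features. That is one reason such rows suit temporary training-set balancing better than release as a shareable dataset.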
Monitoring & Follow-up
Assess classification performance using Macro-F1, ROC-AUC, G-mean, and MCC.
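The four metrics can be sketched in plain Python for the binary case, with label 1 standing for the minority "needs attention" class. This is a simplification for illustration: the benchmark may use multi-class versions, and the labels and scores below are hypothetical. The example also shows why accuracy alone misleads: a classifier that never predicts the minority class is 80% accurate here, yet scores zero on G-mean and MCC.

```python
from math import sqrt

def confusion(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) counts for a binary problem."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    # Unweighted mean of the positive-class F1 and the negative-class F1.
    return (f1(tp, fp, fn) + f1(tn, fn, fp)) / 2

def g_mean(y_true, y_pred):
    # Geometric mean of sensitivity and specificity; assumes both classes occur.
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

def mcc(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    # Probability that a random positive outscores a random negative (ties count half).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced sample: 8 majority cases, 2 minority cases.
y_true = [0] * 8 + [1] * 2
always_negative = [0] * 10  # a classifier that never flags anyone
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.1, 0.3, 0.35, 0.6]

accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
report = {
    "macro_f1": macro_f1(y_true, always_negative),
    "g_mean": g_mean(y_true, always_negative),
    "mcc": mcc(y_true, always_negative),
    "roc_auc": roc_auc(y_true, scores),
}
```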
Risks
Class imbalance can lead to under-prediction of urgent minority cases.
Patient & Prescribing Data
Students with varying levels of depression, anxiety, and stress.
Machine learning can enhance the identification of students needing clinical attention.
Clinical Best Practices
Implement psychometric validity checks when using synthetic data.
Regularly evaluate the performance of machine-learning classifiers on imbalanced datasets.
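As one example of a psychometric validity check, the sketch below compares Cronbach's alpha, a standard internal-consistency statistic, between a real and a synthetic item set. Choosing alpha here is an assumption made for illustration; the source does not say which checks the benchmark applies, and the toy responses are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a scale; each row holds one respondent's item scores."""
    items = list(zip(*rows))  # transpose: one tuple of responses per item
    k = len(items)
    item_var = sum(variance(item) for item in items)    # sum of item variances
    total_var = variance([sum(row) for row in rows])    # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

# Toy 3-item responses (hypothetical values).
real = [[0, 1, 0], [1, 1, 2], [2, 3, 1], [3, 2, 3]]
synthetic = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]

alpha_real = cronbach_alpha(real)
alpha_synthetic = cronbach_alpha(synthetic)
alpha_gap = abs(alpha_real - alpha_synthetic)
```

A large gap in either direction is a warning sign. In this toy case the synthetic items agree perfectly with one another, so the synthetic scale looks more internally consistent than the real one, which is itself a distortion of the construct.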