Beyond classification metrics: a psychometric-aware benchmark for data augmentation in imbalanced student mental health surveys

By Chen Shao, Shengnan Qiao, Ang Li, Shipeng Liu, Yanglong Chen, Yuzhe Tan, Zhibo Liang, and Xuanming Si

June 15, 2026

Clinical Scorecard: A Psychometric-Informed Benchmark for Data Augmentation in Imbalanced Mental Health Surveys Among Students: Moving Beyond Classification Metrics

At a Glance

  • Condition: Mental health disorders in students
  • Key Mechanisms: Data augmentation strategies for improving classification utility and construct validity in mental health surveys.
  • Target Population: University students experiencing mental health issues.
  • Care Setting: Academic institutions and mental health screening environments.

Key Highlights

  • Benchmarking of eight data-augmentation strategies on classification utility and construct validity.
  • PCT-GAN showed the best performance for small, imbalanced datasets.
  • SMOTE variants performed well on larger datasets with neural networks.
  • Compared with the other methods, PCT-GAN significantly reduced the deviation of inter-item correlations from those in the real data.
  • Embedding construct-validity constraints enhances the reliability of synthetic mental health data.
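
One way to make "inter-item correlation deviation" concrete is sketched below. It assumes the deviation is the mean absolute difference between the real and synthetic inter-item Pearson correlations; the benchmark's exact definition is not given here, so that definition, the function names, and the toy responses are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length item columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def inter_item_correlations(rows):
    """Map each item pair (i, j) to its correlation across respondents."""
    items = list(zip(*rows))  # transpose: one tuple per survey item
    return {(i, j): pearson(items[i], items[j])
            for i, j in combinations(range(len(items)), 2)}


def correlation_deviation(real_rows, synthetic_rows):
    """Assumed metric: mean absolute gap between real and synthetic correlations."""
    real = inter_item_correlations(real_rows)
    synth = inter_item_correlations(synthetic_rows)
    return sum(abs(real[pair] - synth[pair]) for pair in real) / len(real)


# Toy responses (rows = respondents, columns = items), invented for illustration.
real = [[0, 1, 0], [1, 1, 2], [2, 2, 1], [3, 3, 3], [1, 0, 1]]
synthetic = [[0, 3, 1], [1, 2, 0], [2, 1, 3], [3, 0, 2], [1, 1, 1]]
deviation = correlation_deviation(real, synthetic)
```

A value near zero means the synthetic responses preserve how the survey items move together; larger values indicate that the item structure has drifted, even if a classifier trained on the data still scores well.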

Guideline-Based Recommendations

Diagnosis

  • Use machine-learning-based screening tools to triage students' mental health needs.

Management

  • Select SMOTE-family oversamplers for transient training-set augmentation.
  • Use PCT-GAN for synthetic data intended for sharing and re-analysis.
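
The idea behind SMOTE-style oversampling can be sketched in a few lines: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. This is a minimal illustration, not the implementation used in the benchmark; the function name `smote_sketch`, its parameters, and the toy points are assumptions, and rounding back to ordinal survey values is left out.

```python
import random
from math import dist


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows taken from the TRAINING split only.
train_minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]]
new_points = smote_sketch(train_minority, n_new=6)
```

Because the synthetic rows exist only to rebalance training, they are generated from the training split and discarded afterwards; validation and test data stay untouched.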

Monitoring & Follow-up

  • Assess classification performance using Macro-F1, ROC-AUC, G-mean, and MCC.
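
Three of these metrics can be computed directly from label counts, as in the sketch below. It assumes multi-class labels, takes G-mean as the geometric mean of per-class recall, and uses the multi-class form of MCC; ROC-AUC is omitted because it needs predicted scores rather than hard labels. The function names and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import prod, sqrt


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen."""
    true_n, pred_n = Counter(y_true), Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    labels = set(y_true) | set(y_pred)
    # F1 for class k equals 2*TP / (predicted_k + actual_k)
    return sum(2 * hits[k] / (true_n[k] + pred_n[k]) for k in labels) / len(labels)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall; zero if any class is never recovered."""
    true_n = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[k] / true_n[k] for k in true_n]
    return prod(recalls) ** (1 / len(recalls))


def mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient."""
    true_n, pred_n = Counter(y_true), Counter(y_pred)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    total = len(y_true)
    cross = sum(true_n[k] * pred_n[k] for k in true_n)
    denom = sqrt((total ** 2 - sum(v ** 2 for v in pred_n.values()))
                 * (total ** 2 - sum(v ** 2 for v in true_n.values())))
    return (correct * total - cross) / denom if denom else 0.0


# Toy labels, invented for illustration: class 2 is the rare, urgent class.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
scores = (macro_f1(y_true, y_pred), g_mean(y_true, y_pred), mcc(y_true, y_pred))
```

In this toy case 7 of 10 predictions are correct, yet the G-mean is zero because the single class-2 case is never predicted, which is the kind of minority-class failure that plain accuracy hides.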

Risks

  • Class imbalance can lead to under-prediction of urgent minority cases.

Patient & Prescribing Data

Students with varying levels of depression, anxiety, and stress.

Machine learning can enhance the identification of students needing clinical attention.

Clinical Best Practices

  • Implement psychometric validity checks when using synthetic data.
  • Regularly evaluate the performance of machine-learning classifiers on imbalanced datasets.
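
As one example of a psychometric validity check, the sketch below compares Cronbach's alpha (internal consistency) between real and synthetic responses. Choosing alpha is an assumption made for illustration; the specific checks used in the benchmark are not listed here, and the function name and toy responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of respondents by columns of items."""
    items = list(zip(*rows))              # one tuple per survey item
    k = len(items)
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var / total_var)


# Toy responses, invented for illustration.
real = [[0, 1, 0], [1, 1, 2], [2, 2, 1], [3, 3, 3], [1, 0, 1]]
synthetic = [[0, 3, 1], [1, 2, 0], [2, 1, 3], [3, 0, 2], [1, 1, 1]]

alpha_real = cronbach_alpha(real)
alpha_synthetic = cronbach_alpha(synthetic)
alpha_gap = abs(alpha_real - alpha_synthetic)
```

A large gap between the two alphas suggests the synthetic data no longer measures the construct with the same consistency as the original survey, which matters most when the synthetic data is shared for re-analysis.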
