Beyond classification metrics: a psychometric-aware benchmark for data augmentation in imbalanced student mental health surveys

By Chen Shao, Shengnan Qiao, Ang Li, Shipeng Liu, Yanglong Chen, Yuzhe Tan, Zhibo Liang, Xuanming Si
June 15, 2026
Frontiers In Digital Health

Clinical Scorecard: A Psychometric-Informed Benchmark for Data Augmentation in Imbalanced Mental Health Surveys Among Students: Moving Beyond Classification Metrics

At a Glance

Category	Detail
Condition	Mental Health Disorders in Students
Key Mechanisms	Data augmentation strategies for improving classification utility and construct validity in mental health surveys.
Target Population	University students experiencing mental health issues.
Care Setting	Academic institutions and mental health screening environments.

Key Highlights

Eight data-augmentation strategies were benchmarked on both classification utility and construct validity.
PCT-GAN showed the best performance for small, imbalanced datasets.
SMOTE variants performed well on larger datasets when paired with neural-network classifiers.
PCT-GAN deviated significantly less from the real data's inter-item correlations than the other methods did.
Embedding construct-validity constraints enhances the reliability of synthetic mental health data.
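
One way to read "inter-item correlation deviation" is the mean absolute difference between the item-by-item Pearson correlations of the real survey and those of the synthetic survey. The sketch below illustrates that reading only; the function names and the toy responses are hypothetical, and the source article may define the measure differently.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def item_correlations(rows):
    """Return {(i, j): r} for every pair of item columns with i < j."""
    cols = list(zip(*rows))
    return {
        (i, j): pearson(cols[i], cols[j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    }

def correlation_deviation(real_rows, synth_rows):
    """Mean absolute gap between real and synthetic inter-item correlations."""
    real = item_correlations(real_rows)
    synth = item_correlations(synth_rows)
    return sum(abs(real[k] - synth[k]) for k in real) / len(real)

# Toy data (hypothetical): six respondents, three Likert-style items.
real = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3], [1, 0, 1], [3, 3, 2]]

# A "synthetic" set that keeps each item's marginal distribution but
# breaks the link between item 1 and the others by reversing that column.
reversed_col = [row[1] for row in real][::-1]
scrambled = [[r[0], c, r[2]] for r, c in zip(real, reversed_col)]

print(correlation_deviation(real, real))       # identical structure
print(correlation_deviation(real, scrambled))  # distorted structure
```

The scrambled set has the same per-item distributions as the real one, so a check on marginals alone would not flag it; the deviation score does.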

Guideline-Based Recommendations

Diagnosis

Use machine-learning-based screening tools to triage students by mental health need.

Management

Select SMOTE-family oversamplers when synthetic records are needed only transiently, to augment the training set.
Use PCT-GAN for synthetic data intended for sharing and re-analysis.
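
The core idea behind SMOTE-family oversampling can be sketched in a few lines: each synthetic minority record is placed on the line segment between a minority record and one of its nearest minority neighbours. This is a simplified illustration, not the configuration benchmarked in the source article; `smote_like`, its parameters, and the toy data are hypothetical.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows.

    Simplified SMOTE-style sketch: pick a minority row, pick one of its k
    nearest minority neighbours, and sample a point on the segment between.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class (hypothetical): four respondents, two item scores.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]]
new_rows = smote_like(minority, n_new=5)
print(len(new_rows))
```

Interpolated values are generally not whole numbers, so they no longer look like valid Likert responses. That is acceptable for records that exist only inside a training fold, which fits the recommendation to keep this kind of augmentation transient. Oversampling should be applied to the training split only, never to the evaluation data.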

Monitoring & Follow-up

Assess classification performance using Macro-F1, ROC-AUC, G-mean, and MCC.
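
As a minimal sketch of why these four metrics are preferred over plain accuracy on imbalanced data, the code below computes them with the standard textbook definitions. It assumes a binary screening task with label 1 for the minority (urgent) class; the source article's task may be multi-class, and the helper names and toy labels are hypothetical.

```python
from math import sqrt

def confusion(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn), treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp, _, fp, fn = confusion(y_true, y_pred, positive=label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (binary case)."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(sens * spec)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient (binary case); 0.0 if undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy case (hypothetical): 9 non-urgent students, 1 urgent student,
# and a classifier that always predicts "non-urgent".
y_true = [0] * 9 + [1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(accuracy, macro_f1(y_true, y_pred), g_mean(y_true, y_pred), mcc(y_true, y_pred))
```

In this toy case accuracy is 0.9 even though the single urgent student is missed, while G-mean and MCC both fall to 0 and Macro-F1 drops below 0.5. ROC-AUC is computed from predicted scores rather than hard labels, so `roc_auc` takes a separate list of scores.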

Risks

Class imbalance can lead to under-prediction of urgent minority cases.

Patient & Prescribing Data

Students with varying levels of depression, anxiety, and stress.

Machine learning can enhance the identification of students needing clinical attention.

Clinical Best Practices

Implement psychometric validity checks when using synthetic data.
Regularly evaluate the performance of machine-learning classifiers on imbalanced datasets.
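
One commonly used psychometric check is to compare a scale's internal consistency (Cronbach's alpha) between the real and the synthetic responses. The sketch below assumes that check for illustration; this summary does not state which validity checks the source article uses, and `cronbach_alpha`, `alpha_gap`, and the toy data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_gap(real_rows, synth_rows):
    """Absolute difference in alpha between real and synthetic data."""
    return abs(cronbach_alpha(real_rows) - cronbach_alpha(synth_rows))

# Toy data (hypothetical): six respondents, three items on one subscale.
real = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3], [1, 0, 1], [3, 3, 2]]

# Synthetic set with the same per-item distributions but one item reversed
# across respondents, which weakens the scale's internal consistency.
reversed_col = [row[1] for row in real][::-1]
scrambled = [[r[0], c, r[2]] for r, c in zip(real, reversed_col)]

print(cronbach_alpha(real), cronbach_alpha(scrambled), alpha_gap(real, scrambled))
```

A large gap suggests the synthetic items no longer behave like a single coherent scale, even when each item's distribution looks plausible on its own.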
