Multimodal machine learning for video based single question mental health assessment

By
Bradley Grimm
Pernille Yilmam
Brett Talbot
Loren Larsen
December 16, 2025

npj Digital Medicine

Objective:

To demonstrate that a single video question can effectively predict self-reported depression, anxiety, and trauma through text and voice analysis, addressing the need for efficient mental health assessments.

Key Findings:

Achieved a 64.6% reduction in assessment time (78.4 s vs. 221.7 s) while screening for all three conditions.
Only 1.4% of participants were unwilling to use video-based screening, suggesting high acceptance.
Demonstrated strong predictive performance and demographic consistency across age, gender, and race/ethnicity, reinforcing the model's applicability.
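
The reported time saving can be checked directly from the two durations in the summary. The short sketch below recomputes the percentage; the function name relative_reduction is a hypothetical helper introduced here for illustration, and treating 221.7 s as the baseline duration is an assumption based on the order of the figures above.

```python
def relative_reduction(new_seconds: float, baseline_seconds: float) -> float:
    """Return the percentage reduction of new_seconds relative to baseline_seconds."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (baseline_seconds - new_seconds) / baseline_seconds * 100.0


# Durations taken from the summary: 78.4 s (video question) vs. 221.7 s (assumed baseline).
reduction = relative_reduction(78.4, 221.7)
print(f"{reduction:.1f}% reduction")  # prints "64.6% reduction"
```

The result agrees with the 64.6% figure quoted in the key findings.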

Interpretation:

The study supports the feasibility of efficient multi-condition mental health screening using brief video responses, particularly in the context of increasing provider shortages.

Limitations:

The model predicts scores on the PTSD Checklist for DSM-5 (PCL-5) but does not directly assess trauma exposure, which may limit its comprehensiveness.
Further validation may be needed across larger and more diverse populations to ensure generalizability.

Conclusion:

The study presents a scalable and efficient method for mental health assessment that could alleviate the burden on healthcare providers and improve patient engagement.

Original Source(s)

Multimodal machine learning for video based single question mental health assessment
