Multimodal behavioral phenotyping for depressive-spectrum classification and severity estimation using eye tracking, facial behavior, and transcript-derived language - Report - MDSpire

By Xiang-Ting Chen and Min Huang

June 16, 2026

Clinical Report: Comprehensive Behavioral Profiling for Classifying Depression

Overview

This study presents a multimodal framework for classifying depressive-spectrum states and estimating severity from eye tracking, facial behavior, and transcript-derived language. The framework classified with accuracy near 0.90 and showed improved calibration in severity prediction relative to the models it was compared against.

Background

Depression is a leading cause of disability globally, yet its assessment often relies on subjective reports and clinician judgment. There is a pressing need for objective tools that can accurately classify depressive states and estimate severity, particularly for subthreshold depression, which is prevalent and associated with significant functional impairment. This study addresses these gaps by integrating multiple behavioral modalities for a more comprehensive assessment.

Data Highlights

Metric                        Baseline-3    Baseline-3+
Accuracy                      ~0.90         ~0.90
Balanced Accuracy             ~0.90         ~0.90
F1-macro                      ~0.90         ~0.90
Expected Calibration Error    Higher        Lower
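
For readers unfamiliar with the four metrics in the table, the following is a minimal sketch of how they are commonly computed. It is not the authors' evaluation code. The class names ("control", "subthreshold", "depressed"), the toy labels, the confidences, and the choice of 10 equal-width confidence bins for expected calibration error are all assumptions made for illustration.

```python
# Illustrative only: toy labels and confidences, not data from the report.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def _counts(y_true, y_pred):
    """Per-class true positives, false positives, and false negatives."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}
    for t, p in zip(y_true, y_pred):
        if t == p:
            counts[t]["tp"] += 1
        else:
            counts[t]["fn"] += 1  # the true class was missed
            counts[p]["fp"] += 1  # the predicted class was wrongly assigned
    return counts


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so small classes weigh as much as large ones."""
    recalls = [
        c["tp"] / (c["tp"] + c["fn"])
        for c in _counts(y_true, y_pred).values()
        if c["tp"] + c["fn"] > 0
    ]
    return sum(recalls) / len(recalls)


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in _counts(y_true, y_pred).values():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        f1s.append(2 * c["tp"] / denom if denom else 0.0)
    return sum(f1s) / len(f1s)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-weighted mean gap between accuracy and mean confidence.

    confidences: predicted-class probability per sample, in [0, 1].
    correct: 1 if the prediction was right, else 0.
    n_bins: number of equal-width bins (an assumed, common default).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        bin_acc = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / total) * abs(bin_acc - mean_conf)
    return ece


# Hypothetical three-class example with one boundary error.
y_true = ["control", "control", "subthreshold", "subthreshold", "depressed", "depressed"]
y_pred = ["control", "subthreshold", "subthreshold", "subthreshold", "depressed", "depressed"]
conf = [0.90, 0.60, 0.80, 0.70, 0.95, 0.85]
hits = [int(t == p) for t, p in zip(y_true, y_pred)]

print("accuracy", accuracy(y_true, y_pred))
print("balanced accuracy", balanced_accuracy(y_true, y_pred))
print("F1-macro", f1_macro(y_true, y_pred))
print("ECE", expected_calibration_error(conf, hits))
```

Two models can share nearly identical accuracy and F1 while differing in expected calibration error, which is the pattern the table reports: the lower value means predicted confidence tracks observed correctness more closely.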

Key Findings

  • Baseline-3+ achieved accuracy, balanced accuracy, and F1-macro near 0.90 for depressive-spectrum classification, with lower expected calibration error than Baseline-3.
  • Facial features were the dominant signal for classification, followed by eye tracking and then language.
  • Misclassification was primarily observed near the boundary between subthreshold depression and normal controls.
  • The framework effectively handled missing modalities, enhancing the robustness of predictions.
  • Interpretability analyses confirmed stable quality-aware modality reweighting in the model.
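
The report does not describe how missing modalities or quality-aware reweighting are implemented. The sketch below shows one generic way such behavior can be obtained, using late fusion with a softmax over per-modality quality scores. The function name `fuse_modalities`, the quality scores, the `temperature` parameter, and the example probabilities are all hypothetical and should not be read as the authors' method.

```python
import math


def fuse_modalities(modality_probs, quality, temperature=1.0):
    """Quality-weighted late fusion that tolerates missing modalities.

    modality_probs: dict mapping modality name to a list of class
        probabilities, or None when that modality is missing.
    quality: dict mapping modality name to a quality score
        (hypothetical, e.g. tracking confidence in [0, 1]).
    temperature: softmax temperature; lower values favor the
        highest-quality modality more strongly (assumed knob).
    Returns (fused_probs, weights).
    """
    # Drop missing modalities so they receive no weight at all.
    available = {m: p for m, p in modality_probs.items() if p is not None}
    if not available:
        raise ValueError("at least one modality must be present")

    # Softmax over quality scores, restricted to available modalities.
    logits = {m: quality[m] / temperature for m in available}
    peak = max(logits.values())
    exps = {m: math.exp(v - peak) for m, v in logits.items()}
    norm = sum(exps.values())
    weights = {m: e / norm for m, e in exps.items()}

    # Weighted average of class probabilities across modalities.
    n_classes = len(next(iter(available.values())))
    fused = [
        sum(weights[m] * available[m][k] for m in available)
        for k in range(n_classes)
    ]
    return fused, weights


# Hypothetical sample: the language transcript is missing.
probs = {
    "face": [0.10, 0.30, 0.60],
    "eye": [0.20, 0.50, 0.30],
    "language": None,
}
scores = {"face": 0.9, "eye": 0.5, "language": 0.0}

fused, weights = fuse_modalities(probs, scores)
print("weights", weights)
print("fused", fused)
```

Because the weights are renormalized over whichever modalities are present, the fused output remains a valid probability distribution when one input is absent, and a noisy recording is down-weighted rather than discarded.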

Clinical Implications

The multimodal framework can augment traditional clinical assessments by providing objective data on depressive-spectrum classification and severity estimation. This is particularly beneficial for identifying patients in boundary states, such as subthreshold depression, who may require targeted interventions.

Conclusion

The integration of eye tracking, facial expressions, and language analysis offers a promising approach to enhance the accuracy and objectivity of depression assessments, potentially transforming clinical practice.
