Automated emotion recognition via video-based semantic embeddings - Report - MDSpire

Automated emotion recognition via video-based semantic embeddings

By Hannes Diemerling, Patricia Kulla, Joachim Kruse, and Timo von Oertzen

May 29, 2026

Clinical Report: Video-Based Semantic Embeddings for Automated Recognition of Emotions

Overview

This study presents an automated emotion recognition system trained on a large corpus of authentic facial expressions recorded during psychotherapy sessions. Its predictions align closely with human annotations, and it recognizes joy, sadness, and fear effectively.

Background

Accurate emotion recognition is crucial in clinical settings, particularly in psychotherapy, where it informs treatment decisions. Traditional methods often rely on acted datasets, which may not capture the complexity of spontaneous emotional expressions. This research aims to enhance emotion recognition through advanced modeling techniques that reflect real-world emotional dynamics.

Data Highlights

Leave-one-out cross-validation yielded a mean z-score of 1.97, which the report interprets as strong model performance. External evaluation on the RAVDESS dataset (Ryerson Audio-Visual Database of Emotional Speech and Song) confirmed effective recognition of joy, sadness, and fear.

Key Findings

  • The model was trained on a large corpus of authentic facial emotion expressions from psychotherapy sessions.
  • Human annotations were embedded in a 768-dimensional semantic space using a fine-tuned German Sentence-BERT model.
  • Transformer, BiLSTM, and deep neural network architectures were employed to map facial landmark features to continuous emotion embeddings.
  • A back-translation mechanism using cosine similarity was implemented for enhanced interpretability.
  • The system, named AFFECT, is an open-source pipeline for analyzing emotional expressions in everyday video recordings.
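
The back-translation step in the findings above can be illustrated with a minimal sketch. It assumes that a predicted embedding is compared against reference embeddings of emotion labels and that the closest labels by cosine similarity are reported. The names `cosine_similarity`, `back_translate`, and `LABEL_EMBEDDINGS` are hypothetical, and the 3-dimensional toy vectors stand in for the 768-dimensional Sentence-BERT embeddings; this is not the AFFECT implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def back_translate(predicted, label_embeddings, top_k=1):
    """Return the top_k (label, similarity) pairs closest to `predicted`."""
    scored = [
        (label, cosine_similarity(predicted, embedding))
        for label, embedding in label_embeddings.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy reference embeddings (assumption: real ones would be 768-dimensional
# vectors produced by the sentence-embedding model).
LABEL_EMBEDDINGS = {
    "joy": [1.0, 0.0, 0.0],
    "sadness": [0.0, 1.0, 0.0],
    "fear": [0.0, 0.0, 1.0],
}

# Hypothetical model output for one video segment.
predicted = [0.9, 0.1, 0.2]
ranking = back_translate(predicted, LABEL_EMBEDDINGS, top_k=3)

for label, similarity in ranking:
    print(f"{label}: {similarity:.3f}")
```

Ranking all labels rather than returning only the best match keeps the continuous nature of the embedding visible: a segment that sits between two emotions shows similar scores for both.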

Clinical Implications

The findings suggest that automated emotion recognition systems can provide valuable support in clinical settings by offering objective assessments of patients' emotional states. This may enhance the clinician's ability to tailor interventions based on real-time emotional feedback.

Conclusion

The development of AFFECT represents a significant advancement in automated emotion recognition, with potential applications in clinical practice and beyond.
