Using a fine-tuned large language model for symptom-based depression evaluation - Takeaways - MDSpire

By Samantha Weber, Nicolas Deperrois, Robert Heun, Laura Frühschütz, Anna Monn, Stephanie Homan, Andrea Häfliger, Erich Seifritz, Tobias Kowatsch, Birgit Kleim, and Sebastian Olbrich

October 7, 2025

  1. A fine-tuned German BERT-based model accurately predicts MADRS scores, achieving a mean absolute error of 0.7 to 1.0 across depressive symptom items.

  2. The model's accuracy ranged from 79% to 88%, closely matching clinician ratings and indicating its potential for clinical use.

  3. Fine-tuning reduced prediction errors by 75% compared with the untrained model, making the model more specific to depressive symptom assessment.

  4. The study used a dataset of 126 MADRS interviews that combined real patient transcripts with synthetic interviews to ensure a balanced score distribution.

  5. The results highlight the potential of lightweight LLMs for automated assessment and monitoring of depressive symptoms, especially in low-resource settings.
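
For readers less familiar with the metrics quoted above, the following is a minimal Python sketch of how a mean absolute error, an accuracy, and a relative error reduction could be computed for a single MADRS item (each item is rated from 0 to 6). It is not the authors' evaluation code. The scores are invented toy values, the function names are hypothetical, and the accuracy definition (exact match by default, with an optional tolerance) is an assumption, since the summary does not say how accuracy was defined.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference scores."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def accuracy_within(predicted, reference, tolerance=0):
    """Share of predictions within `tolerance` points of the reference.

    tolerance=0 means exact match. Assumption: the summary does not
    state which definition the study used.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)
    return hits / len(predicted)


def relative_error_reduction(baseline_mae, tuned_mae):
    """Fraction by which the error drops relative to the baseline model."""
    if baseline_mae <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_mae - tuned_mae) / baseline_mae


# Invented toy scores for one MADRS item (0 to 6); not data from the study.
clinician = [0, 2, 4, 3, 6, 1, 5, 2]
baseline_pred = [3, 3, 3, 3, 3, 3, 3, 3]  # stand-in for a model without fine-tuning
tuned_pred = [0, 3, 4, 3, 5, 1, 5, 2]     # stand-in for a fine-tuned model

baseline_mae = mean_absolute_error(baseline_pred, clinician)
tuned_mae = mean_absolute_error(tuned_pred, clinician)

print(f"MAE without fine-tuning: {baseline_mae:.3f}")
print(f"MAE with fine-tuning:    {tuned_mae:.3f}")
print(f"Exact-match accuracy:    {accuracy_within(tuned_pred, clinician):.0%}")
print(f"Relative MAE reduction:  {relative_error_reduction(baseline_mae, tuned_mae):.0%}")
```

Read this way, a 75% reduction in prediction error means the fine-tuned model's error is one quarter of the comparison model's error on the same interviews.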
