Performance of large language models in delivering accurate and comprehensible patient information on heart failure and cardiomyopathy - Report - MDSpire

By Christoph Reich, Jule Leverenz, Charlotte Brand, Lasse Niemeier, Isabel Branzei, Mustafa Yildirim, Farbod Sedaghat-Hamedani, Ali Amr, Norbert Frey, Benjamin Meder

June 9, 2026


Clinical Report: Evaluation of Large Language Models in Patient Information on Heart Failure

Overview

This study evaluates how accurately and comprehensibly large language models (LLMs) deliver patient information on heart failure and cardiomyopathy. Six prominent LLMs were assessed on 50 expert-curated questions. Readability and comprehensiveness varied across models, with Gemini performing best overall.

Background

Heart failure is a complex chronic condition affecting millions, necessitating effective patient education for self-management. The shift towards digital health has made online information a primary resource for patients, but the prevalence of misinformation poses risks. Evaluating LLMs for their ability to provide accurate and understandable health information is crucial for enhancing patient education.

Data Highlights

Model  | Readability (Flesch-Kincaid Grade) | Composite Mean Rating | Preferred Model Selection (%)
Gemini | 11.3 ± 1.9                         | 4.41 ± 0.77           | 43.7
Grok   | N/A                                | 4.23 ± 0.76           | 30.3
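
For readers unfamiliar with the readability column, the Flesch-Kincaid Grade Level is computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59, and the result approximates the US school grade needed to follow the text. The sketch below is only an illustration of that formula. This summary does not state which tool the study used, and the `count_syllables` helper is a hypothetical vowel-group heuristic, so scores from it may differ from those of an established readability library.

```python
import re


def count_syllables(word: str) -> int:
    """Hypothetical heuristic: count vowel groups, discounting a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent unless the word ends in 'le' or 'ee'.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level using simple regex tokenization (an assumption)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    simple = "Your heart is a pump. It moves blood."
    dense = "Cardiomyopathy necessitates comprehensive pharmacological management."
    print(round(flesch_kincaid_grade(simple), 1))
    print(round(flesch_kincaid_grade(dense), 1))
```

Short sentences built from short words score lower (easier), while long clinical terms push the grade up, which is the effect the readability column is meant to capture.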

Key Findings

  • Gemini provided the most readable responses but was also the most verbose.
  • Across 2,700 ratings, Gemini received the highest composite mean rating for completeness and factual reliability.
  • Confabulation avoidance scored consistently high across all models.
  • Conciseness scored the lowest among the evaluated domains.
  • Auto-graders rated the responses higher than medical students and experts did.

Clinical Implications

The findings suggest that while LLMs can provide accurate information, variability in readability and comprehensibility may affect patient understanding. Healthcare professionals should be aware of these differences when recommending digital health resources to patients.

Conclusion

The evaluation of LLMs highlights their potential in delivering patient information on complex conditions like heart failure, though attention to readability and conciseness is necessary for optimal patient comprehension.
