Performance of large language models in delivering accurate and comprehensible patient information on heart failure and cardiomyopathy - Takeaways - MDSpire

By Christoph Reich, Jule Leverenz, Charlotte Brand, Lasse Niemeier, Isabel Branzei, Mustafa Yildirim, Farbod Sedaghat-Hamedani, Ali Amr, Norbert Frey, and Benjamin Meder

June 9, 2026

  1. This study evaluated six large language models (LLMs) for their accuracy and comprehensibility in providing patient information on heart failure and cardiomyopathy.

  2. Gemini received the highest composite mean rating for readability and factual reliability among the evaluated LLMs, followed by Grok.

  3. The evaluation used 50 expert-curated questions, with the responses rated by twelve reviewers across nine domains, including appropriateness and empathy.

  4. All LLMs demonstrated good accuracy in avoiding medical misinformation, though they varied in readability and comprehensiveness.

  5. The study highlights the need for rigorous evaluation of LLMs to ensure their reliability and accessibility for patient education in chronic disease management.
