Using artificial intelligence to generate medical literature for urology patients: a comparison of three different large language models - Takeaways - MDSpire

By David Pompili, Yasmina Richa, Patrick Collins, Helen Richards, and Derek B Hennessey

July 29, 2024

  1. This study compares the ability of three large language models (LLMs) to generate patient information leaflets (PILs) on urological topics.

  2. The LLMs evaluated include OpenAI’s ChatGPT-4, Google’s PaLM 2, and Meta’s Llama 2, focusing on their accuracy and readability.

  3. PaLM 2 produced the highest average quality score for PILs at 3.58, followed by Llama 2 at 3.34 and ChatGPT-4 at 3.08.

  4. Circumcision PILs generated by PaLM 2 achieved the highest mean quality score among all leaflets evaluated.

  5. The study highlights the potential of LLMs to improve patient communication in urology, though variability in performance exists.
