Using artificial intelligence to generate medical literature for urology patients: a comparison of three different large language models

By
David Pompili
Yasmina Richa
Patrick Collins
Helen Richards
Derek B Hennessey
July 29, 2024

World Journal of Urology

Objective:

To assess how well three mainstream large language models (LLMs) generate accurate patient information leaflets (PILs) on urological topics, to evaluate the readability of those leaflets, and to compare the models' performance.

Key Findings:

PaLM 2 achieved the highest average quality score for PILs (3.58), followed by Llama 2 (3.34) and ChatGPT-4 (3.08). However, the differences in quality scores between the LLMs were not statistically significant for the assessed topics, so this ranking needs further investigation before it can be relied on.

Interpretation:

The study indicates that LLMs, particularly PaLM 2, can produce high-quality patient information leaflets, which may enhance patient understanding of urological conditions.

Limitations:

The study focused on only three LLMs and four urological topics, limiting the generalizability of the findings. Additionally, the evaluation was conducted by a panel of clinicians, which may introduce bias in scoring and affect the reliability of the results.

Conclusion:

LLMs show promise in generating patient information leaflets that are both accurate and accessible, with PaLM 2 performing particularly well in this context. Future research should explore a broader range of LLMs and topics to validate these findings.


