Assessing retina-specific ophthalmic counseling generated by an early public large language model across different levels of clinical urgency - Summary - MDSpire
Objective:
To evaluate how the quality of retina-specific ophthalmology counseling provided by an early publicly available large language model (LLM) differs when advising patients with varying clinical characteristics and risk factors.
Approach:
Study Design: Prospective, cross-sectional study involving 18 ophthalmologists rating LLM-generated counseling based on six patient vignettes with varying clinical urgencies.
Vignette Creation: Six clinical vignettes were constructed representing high- and low-urgency scenarios for diabetic retinopathy, retinal detachment, and age-related macular degeneration.
LLM Interaction: The LLM (ChatGPT-3.5) was prompted to provide medical counseling based on the vignettes, with responses rated by independent reviewers.
Readability Assessment: Readability of the counseling outputs was evaluated using qualitative surveys and quantitative metrics.
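As an illustration of what a quantitative readability metric can look like, the sketch below computes the Flesch-Kincaid Grade Level of a passage. This summary does not name the metrics the study used, so the choice of Flesch-Kincaid is an assumption, and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup. The function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent, except in '-le' endings such as 'table'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level (assumed metric, not confirmed by the study).

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    plain = "The eye is red. See a doctor now."
    dense = ("Rhegmatogenous retinal detachment necessitates immediate "
             "vitreoretinal surgical evaluation.")
    print(round(flesch_kincaid_grade(plain), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

Higher values correspond to higher U.S. school grade levels, so text dense with long medical terms scores well above short plain-language sentences. Scores from a heuristic syllable counter like this one are approximate.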
Key Findings:
Counseling accuracy varied with clinical urgency (p = 0.002), particularly for retinal detachment (p < 0.001).
Counseling urgency did not differ significantly from clinical urgency, except for high-urgency age-related macular degeneration (AMD; p = 0.013) and retinal detachment (RD; p < 0.001).
Empathy in counseling did not differ across clinical urgency (p = 0.2).
Readability assessments indicated that every counseling output required a college-graduate reading level to understand.
Common reasons for difficulty in understanding included excessive medical (49%) and non-medical (45%) terminology.
Interpretation:
The LLM-generated counseling outputs were largely similar across retinal vignettes with differing clinical urgency, indicating a need for optimization in LLM prompting for better accuracy and readability.
Limitations:
The study was limited to a single LLM version and a small sample of ophthalmologists, which may affect the generalizability of the findings.
Readability assessments may not fully capture the nuances of patient understanding.
Conclusion:
Future studies should investigate the optimization of LLM prompting to improve counseling accuracy, readability, empathy, and communication of urgency for specific conditions.