Reliability and readability of adenoid hypertrophy information generated by five publicly accessible LLM chatbots: A default-setting snapshot study - Summary - MDSpire
This study evaluated the reliability and readability of adenoid hypertrophy information generated by five large language models (LLMs) for parent education.
Approach:
Study Design: A comparative evaluation of five LLMs (ChatGPT, DeepSeek, Gemini, Perplexity, and Copilot) was conducted using a standardized question bank related to adenoid hypertrophy.
Data Collection: The LLMs were prompted with a set of 63 questions, derived from public sources, that covered various aspects of adenoid hypertrophy.
Ethics and Reporting: The study adhered to the CHART reporting guideline and did not require ethics committee approval as no real patients were involved.
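This summary does not say which readability metric the study applied. As an illustration only, the sketch below scores a chatbot response with the Flesch Reading Ease formula, a widely used measure that is assumed here rather than taken from the study. The function names and the vowel-group syllable counter are hypothetical simplifications, so the scores are approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    # Hypothetical responses, not taken from the study's data.
    plain = "Adenoids are small lumps of tissue. They sit behind the nose."
    dense = "Adenoid hypertrophy necessitates otolaryngological evaluation."
    print(f"plain: {flesch_reading_ease(plain):.1f}")
    print(f"dense: {flesch_reading_ease(dense):.1f}")
```

Higher scores indicate easier text, so a response written in short sentences with short words should score above one that relies on clinical terminology. A production analysis would more likely use a validated readability library than this heuristic syllable counter.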
Key Findings:
The reported prevalence of adenoid hypertrophy among children aged 3 to 8 years ranges from 34% to 70%.
Existing studies on LLM performance in medical contexts show mixed results regarding reliability and readability.
No prior systematic evaluation of LLMs for pediatric otorhinolaryngology conditions like adenoid hypertrophy has been conducted.
Interpretation:
Limitations:
The study included no direct patient or public involvement.
Potential biases inherent in AI-assisted approaches were acknowledged but not fully explored.