Reliability and readability of adenoid hypertrophy information generated by five publicly accessible LLM chatbots: A default-setting snapshot study

By Xiaoming Qian, Zhishui Wu, Jing Li, Qiuyu Su, Qian Qin, and Beibei Zhang
July 2, 2026

Digital Health

Objective:

To evaluate the reliability and readability of information about adenoid hypertrophy generated by five large language models (LLMs) for parent education.

Approach:

Study Design: A comparative evaluation of five LLMs (ChatGPT, DeepSeek, Gemini, Perplexity, and Copilot) was conducted using a standardized question bank related to adenoid hypertrophy.
Data Collection: Each LLM was prompted with the same set of 63 questions, derived from public sources and covering different aspects of adenoid hypertrophy.
Ethics and Reporting: The study adhered to the CHART reporting guideline and did not require ethics committee approval because no real patients were involved.
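
This summary does not say which readability metric the study used. Purely as an illustration of how a chatbot answer could be scored for readability, the sketch below computes the widely used Flesch Reading Ease score. The function names and the vowel-group syllable heuristic are assumptions made for this example, not the authors' method.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count runs of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text.

    score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    # Hypothetical answers, not taken from the study.
    plain = "Adenoids are small."
    dense = "Adenoid hypertrophy necessitates otorhinolaryngological evaluation."
    print(round(flesch_reading_ease(plain), 2))
    print(round(flesch_reading_ease(dense), 2))
```

Because the syllable counter is only approximate, scores from this sketch will differ somewhat from those of dedicated readability tools. It is meant to show the shape of the calculation, not to reproduce the study's numbers.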

Key Findings:

The reported prevalence of adenoid hypertrophy among children aged 3 to 8 years ranges from 34% to 70%.
Existing studies on LLM performance in medical contexts show mixed results regarding reliability and readability.
No prior systematic evaluation of LLMs for pediatric otorhinolaryngology conditions like adenoid hypertrophy has been conducted.

Interpretation:

Limitations:

The study did not involve direct patient or public involvement.
Potential biases inherent in AI-assisted approaches were acknowledged but not fully explored.

Original Source(s)

Reliability and readability of adenoid hypertrophy information generated by five publicly accessible LLM chatbots: A default-setting snapshot study
