Large language model chatbots as sources of pediatric anesthesia health advice: An evaluation of reliability and readability - Summary - MDSpire

By Xue Zhang, Yuchen Dai, Xin Zhao, Lin Wu, Boming Shao, Xisheng Shan, Fuhai Ji, Runzhi Deng, Baojian Zhao

June 29, 2026

Objective:

To evaluate the reliability and accessibility of health information provided by large language models (LLMs) for pediatric anesthesia, with the aim of informing the integration of AI-driven tools into perioperative education.

Approach:
  • Study Design: A cross-sectional observational analysis, following the STROBE and CHART guidelines, assessed 72 responses from four LLM-based chatbots to pediatric anesthesia questions.
  • Data Collection: Standardized search terms related to pediatric anesthesia were identified and used to generate chatbot prompts that reflect real-world caregiver concerns.
Key Findings:
  • The study evaluated responses from ChatGPT, Claude, DeepSeek, and Gemini regarding pediatric anesthesia.
  • Responses were assessed for readability, informational reliability, and overall quality, revealing variability in performance across the models.
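
To illustrate what a readability assessment of a chatbot response can look like, here is a minimal sketch in Python. This summary does not name the readability instrument the authors used, so the Flesch Reading Ease formula is an assumption chosen for illustration, and the helper names `count_syllables` and `flesch_reading_ease` are hypothetical. The syllable counter is a rough vowel-group heuristic, not a dictionary lookup.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent (e.g. "make"), except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text.

    Assumption: this metric stands in for whatever readability
    measure the study actually applied.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Two made-up responses, not taken from the study.
plain = "Your child will sleep. The doctor will watch."
dense = ("Perioperative anesthetic administration necessitates "
         "comprehensive cardiorespiratory monitoring.")
print(flesch_reading_ease(plain) > flesch_reading_ease(dense))
```

Short sentences made of short words score higher, which is why the plain response ranks as easier to read than the dense one. A score like this captures only surface readability; it says nothing about whether the advice is reliable.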
Interpretation:

Given the variability across models, the findings indicate that caregivers need LLM-generated information that is both accurate and comprehensible in pediatric anesthesia contexts.

Limitations:
  • The study did not perform a formal statistical sample size calculation due to its observational nature.
  • Responses were limited to predefined prompts and may not encompass all caregiver concerns.
Conclusion:

The study highlights the necessity of evaluating LLM-generated health information for pediatric anesthesia to ensure it is safe and clear for caregivers.
