Freely accessible large language models for parent education in pediatric immune thrombocytopenia: an expert-rated cross-sectional study of safety, readability, and guideline concordance - Summary - MDSpire

By Orkun Dinç, Eray Akay, and Belen Ates

July 1, 2026


Objective:

To assess the quality, safety, readability, and guideline/reference concordance of responses from freely accessible large language models (LLMs) to parent-oriented questions about childhood immune thrombocytopenia (ITP).

Approach:
  • Study Design: Cross-sectional analysis of LLM-generated responses to standardized parent-oriented questions about childhood ITP.
  • Evaluation Method: Responses were rated by clinical reviewers using the AI-ITP Parent Response Score (AI-ITP-PRS) and assessed for unsafe content and readability based on predefined criteria.
Key Findings:
  • Gemini 3 Flash had the highest mean composite score (32.78), followed by Claude Sonnet 4.6 (30.60) and GPT-5.3-mini (29.98).
  • Medical accuracy, guideline/reference concordance, and safety were largely similar across models.
  • Two instances of unsafe content were detected in responses generated by Gemini 3 Flash.
Interpretation:

Freely accessible LLMs can produce supportive explanations about childhood ITP, but the highest scores reflect communication performance rather than definitive safety superiority.

Limitations:
  • The study cannot precisely estimate true safety-event rates.
  • It does not establish robust rare-event safety differences between models.
  • Model behavior across repeated generations was not characterized.
  • The matched categorical comparison of safety events was not statistically significant.
Conclusion:

Freely accessible LLMs can provide educational content for parents regarding childhood ITP, but caution is needed regarding safety and accuracy.
