Evaluating the accuracy and communication quality of large language models in Ewing sarcoma: a comparative analysis of ChatGPT, Claude, Gemini, DeepSeek, and Grok

By
Cihan Ünyılmaz
June 30, 2026

Frontiers in Pediatrics

Overview

This study compares five large language models (LLMs), ChatGPT, Claude, Gemini, DeepSeek, and Grok, on the accuracy and communication quality of the information they provide about Ewing sarcoma.

Background

Ewing sarcoma is a rare and aggressive pediatric cancer that requires complex management involving multidisciplinary teams. Accurate communication is critical, as families seek reliable information about diagnosis, treatment options, and prognosis. Because LLMs are now used as a source of medical information, their ability to deliver accurate, understandable patient education needs to be assessed.

Data Highlights

Model	Overall Performance	Technical Accuracy	Communication Quality
ChatGPT	Highest	Moderate	Best
Claude	Second	Moderate	Good
DeepSeek	Third	Highest	Lower
Gemini	Lower	Low	Low
Grok	Lowest	Low	Low

Key Findings

ChatGPT achieved the highest overall performance among the LLMs evaluated.
DeepSeek demonstrated the greatest technical accuracy but lower communication quality.
Gemini and Grok produced more superficial responses with lower overall scores.
Significant differences in performance were observed among the five LLMs (p < 0.001).
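
This summary does not say which statistical test produced the reported p-value. As an illustration only, the sketch below shows one common way to compare ratings across five groups: a Kruskal-Wallis rank test. The choice of test, the function names, and the scores are all assumptions made for this example; the numbers are invented placeholders, not data from the study.

```python
import math
from collections import Counter
from itertools import chain


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def kruskal_wallis(groups):
    """Return (H, p) for a Kruskal-Wallis test, using mid-ranks and tie correction."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = next_rank + (t - 1) / 2.0
        next_rank += t

    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction; if every value is identical there is nothing to compare.
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf_even_df(h, len(groups) - 1)


# Hypothetical 1-5 ratings for five unnamed models (placeholders, not study data).
placeholder_scores = [
    [5, 4, 5, 4, 5],
    [4, 4, 5, 3, 4],
    [4, 3, 4, 3, 3],
    [2, 3, 2, 3, 2],
    [2, 1, 2, 2, 1],
]
h_stat, p_value = kruskal_wallis(placeholder_scores)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

With five groups the test has four degrees of freedom, which is why the even-df closed form for the chi-square tail is sufficient here. A significant result of this kind indicates only that at least one model differs from the others; identifying which pairs differ would require follow-up pairwise comparisons.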

Clinical Implications

Current LLMs can support patient education but should not replace specialist consultation.

Conclusion

This study emphasizes the need for careful evaluation and validation of LLMs before their routine use in clinical practice.


Related Resources & Content

Original Source(s)

Evaluating the accuracy and communication quality of large language models in Ewing sarcoma: a comparative analysis of ChatGPT, Claude, Gemini, DeepSeek, and Grok
