Evaluating the accuracy and communication quality of large language models in Ewing sarcoma: a comparative analysis of ChatGPT, Claude, Gemini, DeepSeek, and Grok

By
Cihan Ünyılmaz
June 30, 2026

Frontiers In Pediatrics

Objective:

To compare the clinical accuracy, comprehensiveness, and communication quality of five widely used large language models (LLMs) in answering frequently asked questions about Ewing sarcoma.

Approach:

Evaluation Method: Twelve representative questions were presented to five LLMs, and responses were evaluated by two orthopedic oncology specialists using a 4-point Likert scale for clinical accuracy, completeness, clarity, and relevance.
Statistical Analysis: Statistical analyses included Friedman, Wilcoxon signed-rank, Kruskal–Wallis, and Mann–Whitney U tests.
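
As an illustration of the omnibus step, the Friedman test ranks the models within each question and checks whether their mean ranks differ by more than chance would explain. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, the ratings in the demo are invented for illustration, and the p-value helper covers only an even number of degrees of freedom (which includes the five-model case, where df = 4).

```python
import math


def average_ranks(values):
    """Rank one question's scores (1 = lowest); ties share the average rank.

    Also returns the tie term sum(t**3 - t) over groups of t tied scores.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    return ranks, tie_term


def friedman_statistic(scores):
    """Tie-corrected Friedman chi-square for rows = questions, cols = models.

    Returns (statistic, mean rank per model).
    """
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    ties = 0
    for row in scores:
        ranks, tie_term = average_ranks(row)
        ties += tie_term
        for j, r in enumerate(ranks):
            rank_sums[j] += r
    correction = 1 - ties / (k * (k * k - 1) * n)
    if correction == 0:
        raise ValueError("every question has fully tied scores")
    sum_sq = sum(r * r for r in rank_sums)
    stat = (12.0 * sum_sq / (k * n * (k + 1)) - 3 * n * (k + 1)) / correction
    return stat, [r / n for r in rank_sums]


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch only handles even, positive df")
    term = total = 1.0
    for i in range(1, df // 2):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total


if __name__ == "__main__":
    # Invented 4-point Likert ratings: 4 questions x 5 models (illustration only).
    demo = [
        [4, 3, 2, 3, 2],
        [4, 4, 3, 3, 2],
        [3, 4, 2, 4, 1],
        [4, 3, 3, 3, 2],
    ]
    stat, mean_ranks = friedman_statistic(demo)
    p = chi2_sf_even_df(stat, df=len(demo[0]) - 1)
    print(f"chi-square = {stat:.3f}, p = {p:.4f}, mean ranks = {mean_ranks}")
```

A significant omnibus result would typically be followed by pairwise comparisons, such as the Wilcoxon signed-rank test listed above, to see which models differ.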

Key Findings:

Significant differences were observed among the five LLMs (p < 0.001).
ChatGPT achieved the highest overall performance, followed by Claude and DeepSeek.
DeepSeek demonstrated the greatest technical accuracy but lower communication quality.
ChatGPT provided the best balance between factual correctness and patient-friendly communication.
Gemini and Grok produced more superficial responses with lower overall scores.

Interpretation:

Variability among models remains substantial, necessitating further validation and disease-specific optimization.

Limitations:

The findings rest on only twelve questions and five LLMs.
Responses were evaluated by only two specialists, which may affect the reliability of the assessments.
