Clinical evaluation of large language model recommendations in melanoma: comparison with multidisciplinary tumor board decisions in a real-world cohort - Report - MDSpire

By Belma Babic, Sefika Umihanic, Hedim Osmanovic, Nejra Selak, Erna Sehic-Kozica, Lejla Moranjkic, Inga Marijanovic, Marija Karaga, Amina Jalovcic Suljevic, Sekib Umihanic, Fadil Umihanic, Arzumana Ozegovic-Orucevic

June 29, 2026

Clinical Report: Assessment of Large Language Model Suggestions in Melanoma

Overview

This study compares melanoma treatment recommendations generated by four large language models (LLMs) with the decisions of a multidisciplinary tumor board in a real-world cohort.

Background

Malignant melanoma is a significant global health challenge: incidence is rising, and effective treatment strategies are needed. Multidisciplinary tumor boards (MDTs) play a crucial role in decision-making for melanoma management, particularly in resource-limited settings. Integrating LLMs into this process requires thorough evaluation.

Data Highlights

LLM | Performance Rating
ChatGPT-5 Thinking | Strongest
ChatGPT-4o | Moderate
Gemini 2.5 Pro | Less Favorable
DeepSeek-V3.2 | Least Favorable

Key Findings

  • Inter-rater reliability among oncologists was acceptable to good.
  • ChatGPT-5 Thinking showed consistent performance across evaluated domains.
  • Statistically significant differences were observed between the LLMs in all domains assessed.
  • Performance differences were most relevant in complex treatment scenarios.
  • LLM-generated recommendations should not replace independent treatment decisions.

Clinical Implications

The findings indicate that LLMs may have a role in melanoma treatment decision-making, but their recommendations should be used as supportive tools rather than as standalone treatment decisions.

Conclusion

This study emphasizes the need for further research before LLMs can be integrated into clinical workflows.

