Knowledge localization is associated with higher performance of domestic large language models in a Chinese radiation oncology examination - Summary - MDSpire

Knowledge localization is associated with higher performance of domestic large language models in a Chinese radiation oncology examination

By Yuchen Zhou, Shuyu Lin, Xinhai Wang, Ke Hu

June 17, 2026

Objective:

To compare the performance of domestic and international large language models (LLMs) on a Chinese radiation oncology examination.

Approach:

Key Findings:

  • Domestic models led, with Qwen reaching 86.30% accuracy and outperforming the single physician reference participant (adjusted P = 0.020).
  • International models scored markedly lower, especially on questions requiring localized knowledge retrieval, which points to the need for alignment with regional standards.
  • Translating the examination into English did not improve the performance of international models, and it revealed a significant language penalty for some domestic models (e.g., DeepSeek, P = 0.013).
  • Error analysis indicated that failures in international models stemmed mainly from discrepancies between Western and Chinese clinical guidelines.
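
The summary does not state which statistical test produced the adjusted P values. As a minimal sketch, assuming each model and the physician answered the same questions and that per-question correctness was recorded, a paired comparison could use an exact McNemar test with a Holm correction across models. The function names and inputs below are hypothetical and are not taken from the study.

```python
from math import comb


def accuracy(correct_flags):
    """Fraction of questions answered correctly."""
    return sum(correct_flags) / len(correct_flags)


def mcnemar_exact(a_correct, b_correct):
    """Two-sided exact McNemar test on paired per-question outcomes.

    Only discordant questions (one rater right, the other wrong) carry
    information; under the null they split 50/50 between the two raters.
    """
    a_only = sum(1 for a, b in zip(a_correct, b_correct) if a and not b)
    b_only = sum(1 for a, b in zip(a_correct, b_correct) if b and not a)
    n = a_only + b_only
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(a_only, b_only)
    # Binomial(n, 0.5) lower tail up to k, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted P values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the rank-th smallest P by (m - rank) and keep monotonicity.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted
```

With one list of booleans per model and one for the physician, calling `mcnemar_exact` for each model against the physician and passing the resulting P values to `holm_adjust` gives one adjusted P value per comparison. Whether the study used this procedure is an assumption.
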

Interpretation:

The findings suggest that alignment with regional clinical standards is a significant factor in model performance on this examination, which highlights the need for localized knowledge.

Limitations:

  • Only one human participant was included for comparison, which limits the generalizability of the results and may affect the conclusions drawn.
  • Differences in model architecture and training data, as well as potential test-set contamination, may also affect outcomes.

Conclusion:

The study highlights the importance of localized knowledge for improving LLM performance in specialized medical assessments.
