Knowledge localization is associated with higher performance of domestic large language models in a Chinese radiation oncology examination - Report - MDSpire
Clinical Report: Enhanced Knowledge Localization Improves Performance of Domestic Large Language Models in Chinese Radiation Oncology Assessments
Overview
This study benchmarks domestic and international large language models (LLMs) against a Chinese radiation oncologist using a national examination. Domestic models, particularly Qwen 3 Max, outperformed the human reference, highlighting the importance of localized knowledge in model performance.
Background
Large language models have shown promise in general medical assessments, which makes their use in specialized fields such as radiation oncology an important question. However, their effectiveness in non-English contexts and specialized domains remains underexplored. Understanding how these models perform in localized settings can inform their development and application in clinical practice.
Data Highlights
DeepSeek V3.2: accuracy and comparison to the physician are not specified in this summary.
GPT-5: comparison to the physician is not specified in this summary.
Key Findings
Qwen 3 Max achieved an accuracy of 86.30%, outperforming the single physician reference.
International models such as GPT-5 showed significant performance declines on questions that depend on localized knowledge retrieval.
Translating the examination into English did not improve performance for international models.
The majority of errors in international models stemmed from discrepancies between Western and Chinese clinical guidelines.
Localized knowledge alignment is crucial for model performance in specialized medical assessments.
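The accuracy figures above are percentages of examination questions answered correctly. As a minimal sketch of how a model and a physician reference might be scored against the same answer key, the Python below uses a hypothetical `accuracy` helper and toy answer lists. None of the names or data come from the study, and the study's actual scoring procedure is not described in this summary.

```python
def accuracy(predictions, answer_key):
    """Return the percentage of questions answered correctly."""
    if len(predictions) != len(answer_key):
        raise ValueError("predictions and answer key must have the same length")
    if not answer_key:
        raise ValueError("answer key is empty")
    correct = sum(p == a for p, a in zip(predictions, answer_key))
    return 100.0 * correct / len(answer_key)


# Hypothetical toy data for illustration only (not from the study).
answer_key = ["A", "C", "B", "D", "A"]
model_answers = ["A", "C", "B", "D", "B"]
physician_answers = ["A", "C", "D", "D", "B"]

model_acc = accuracy(model_answers, answer_key)
physician_acc = accuracy(physician_answers, answer_key)

print(f"model: {model_acc:.2f}%  physician: {physician_acc:.2f}%")
print("model outperforms physician reference:", model_acc > physician_acc)
```

Scoring both respondents against one shared key keeps the comparison like-for-like. With a single physician as the reference, as in this report, the comparison is a point estimate and not a population-level result.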
Clinical Implications
The findings suggest that domestic LLMs may be better suited for localized clinical assessments in radiation oncology. Clinicians should consider the limitations of international models when applying them in non-English contexts, particularly in specialized fields.
Conclusion
This study underscores the importance of localized knowledge in enhancing the performance of large language models in specialized medical assessments, particularly in radiation oncology.