Performance of deepseek-R1 and ChatGPT-5.4 thinking in the medical laboratory professional title examination: accuracy, stability, and comparison with interns

By Zhili Niu, Dongling Tang, Juanjuan Chen, Pingan Zhang, Chengliang Zhu
June 19, 2026

Frontiers in Digital Health

1. The study evaluated the accuracy and reproducibility of DeepSeek-R1 and ChatGPT-5.4 on the Medical Laboratory Junior Professional Title Examination.

2. Both AI models demonstrated good reproducibility, with Fleiss' kappa coefficients exceeding 0.7, indicating stable performance across repeated runs.

3. DeepSeek-R1 outperformed ChatGPT-5.4 in accuracy across most examination papers, particularly Papers I, II, and III.

4. Interns performed comparably to the AI models only on Paper I and scored significantly lower on Papers II, III, and IV.

5. DeepSeek-R1 showed superior overall performance and greater consistency across disciplines than ChatGPT-5.4.
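
The reproducibility figure in the second takeaway rests on Fleiss' kappa, which compares how often repeated answers to the same question agree against the agreement expected by chance. The following is a minimal sketch of that calculation, assuming each question was posed to a model several times and each answer is a single option letter. The function name, the data layout, and the sample answers are illustrative assumptions, not the study's own code or data.

```python
from collections import Counter


def fleiss_kappa(responses):
    """Fleiss' kappa for repeated categorical answers.

    responses[i] holds the answers given to question i across repeated runs
    (hypothetical layout; every question must have the same number of runs).
    """
    n_items = len(responses)
    if n_items == 0:
        raise ValueError("no questions supplied")
    n_runs = len(responses[0])
    if n_runs < 2 or any(len(item) != n_runs for item in responses):
        raise ValueError("every question needs the same number of runs (at least 2)")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in responses:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of run pairs that agree on this question.
        agreement_sum += (sum(c * c for c in counts.values()) - n_runs) / (
            n_runs * (n_runs - 1)
        )

    p_bar = agreement_sum / n_items  # mean observed agreement
    total = n_items * n_runs
    # Agreement expected by chance, from the overall answer distribution.
    p_e = sum((c / total) ** 2 for c in category_totals.values())
    if p_e == 1.0:
        # Every answer fell in one category; kappa is undefined here.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical data: four questions, three runs each.
runs = [
    ["A", "A", "A"],
    ["B", "B", "C"],
    ["D", "D", "D"],
    ["C", "C", "C"],
]
print(round(fleiss_kappa(runs), 3))
```

A value of 1 means the runs always agree, 0 means agreement no better than chance, and negative values mean agreement below chance. Note that kappa measures consistency only: a model can give the same wrong answer on every run and still score highly.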
