Radiologists Tested on AI X-Rays - Summary - MDSpire

  • By Doug Brunk

  • April 1, 2026

  • 4 min

Objective:

To evaluate radiologists' ability to distinguish between AI-generated and real radiographs, highlighting the implications for clinical practice.

Key Findings:
  • Radiologists achieved 75% accuracy in distinguishing synthetic radiographs generated by GPT-4o from real ones.
  • Diagnostic accuracy for identifying abnormalities was high for both image types, reaching 92% for synthetic and 91% for real images.
  • Experience did not significantly affect performance, but musculoskeletal radiologists performed better (83% accuracy).
  • No tested LLM identified every synthetic radiograph; GPT-4o achieved 85% accuracy.
Interpretation:

The study highlights the challenges in detecting increasingly realistic synthetic medical images and underscores the urgent need for improved training and safeguards in clinical practice.

Limitations:
  • The data set was relatively small, and excluding images with obvious AI errors may have made detection harder, potentially skewing results.
  • The equal proportion of synthetic and real images does not reflect real-world prevalence, which could further affect detection accuracy.
  • Potential bias from using GPT-4o for both generation and detection raises questions about the validity of the findings.
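
The prevalence limitation above can be made concrete with a short calculation. The sketch below is illustrative only: the summary reports overall accuracy (75%), not sensitivity or specificity, so both are assumed here to equal 0.75, and the prevalence values are hypothetical. It uses Bayes' rule to estimate what share of images flagged as synthetic would actually be synthetic.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of images flagged as synthetic that really are synthetic."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# ASSUMPTION: the study summary gives only 75% overall accuracy.
# Treating sensitivity and specificity as both 0.75 is a simplification.
SENSITIVITY = 0.75
SPECIFICITY = 0.75

# Hypothetical prevalences: the balanced study design (50%) and two rarer cases.
for prevalence in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

Under these assumptions, a flag is right 75% of the time in a balanced set, 25% of the time at 10% prevalence, and about 3% of the time at 1% prevalence. The real figures depend on the sensitivity and specificity actually observed, which this summary does not report.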
Conclusion:

The findings underscore the risks of synthetic medical images in clinical settings and suggest the need for strategies like watermarking, provenance tracking, and automated detection tools to enhance safety.
