Performance across different versions of an artificial intelligence model for screen-reading of mammograms

By
Marthe Larsen
Christoph I. Lee
Marie B. Bergan
Åsne S. Holen
Håkon Lund-Hanssen
Solveig R. Hoff
Steinar Auensen
Jan F. Nygård
Kristina Lång
Yan Chen
Giske Ursin
Solveig Hofvind
January 13, 2026

European Radiology

Overview

This study compared two versions (1.7 and 2.1) of a commercial AI model, Transpara (ScreenPoint Medical BV), for mammography interpretation in BreastScreen Norway, a large national screening program. The updated version assigned malignancy risk scores differently, which changed screening performance metrics, including the rates of screen-detected and interval cancers. The study also analyzed how tumor characteristics and mammographic features were distributed across AI risk scores.

Background

Breast cancer screening with mammography is evolving with the integration of artificial intelligence (AI) to improve interpretation accuracy and outcomes. Several AI models have regulatory clearance and are used to support radiologists by triaging exams or supplementing double reading. However, challenges remain regarding the impact of AI software updates on screening performance, ethical and legal considerations, and implementation costs. Understanding how successive AI model versions affect clinical outcomes is critical for optimizing screening programs.

Data Highlights

Parameter	Version 1.7	Version 2.1
Number of Screening Exams	~117,709	~117,709
AI Risk Score Categories	1–7 low, 8–9 intermediate, 10 high	Same categorization
Screen-Detected Cancer Definition	Invasive cancer or DCIS after recall	Same
Interval Cancer Definition	Invasive cancer or DCIS within 24 months post-negative screen	Same
AI Model Updates	Original algorithm	Architectural changes, expanded training data, updated sampling
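
The risk score categories and the interval cancer window in the table above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `risk_category` and `is_interval_cancer` are hypothetical and are not part of the AI vendor's software or the study's analysis code.

```python
def risk_category(score: int) -> str:
    """Map an AI risk score (1-10) to the category used in the table.

    1-7 -> "low", 8-9 -> "intermediate", 10 -> "high".
    """
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError(f"AI risk score must be an integer from 1 to 10, got {score!r}")
    if score <= 7:
        return "low"
    if score <= 9:
        return "intermediate"
    return "high"


def is_interval_cancer(months_since_negative_screen: float) -> bool:
    """True if a cancer diagnosed this many months after a negative
    screen falls within the 24-month interval cancer window.

    Assumption: the window is treated as inclusive of month 24.
    """
    if months_since_negative_screen < 0:
        raise ValueError("months_since_negative_screen must be non-negative")
    return months_since_negative_screen <= 24


if __name__ == "__main__":
    for s in (1, 7, 8, 9, 10):
        print(s, risk_category(s))
```

Because both versions use the same categorization, the same mapping can be applied to scores from version 1.7 and version 2.1 before comparing them.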

Key Findings

Version 2.1 of the AI model incorporated architectural algorithm changes and expanded training data from multiple vendors and centers.
AI malignancy risk scores from version 2.1 differed from version 1.7, potentially altering thresholds for suspicious findings.
Screen-detected and interval cancer rates varied when using AI scores from the two versions, indicating impact on screening outcomes.
Histopathological tumor characteristics and mammographic features showed differences in distribution relative to AI risk scores between versions.
Version 2.1 demonstrated improved sensitivity in detecting malignancies compared with human readers and prior AI versions.

Clinical Implications

Clinicians should be aware that updates to AI mammography models can change malignancy risk scoring and influence screening performance metrics. Continuous validation and quality assurance are essential when implementing new AI software versions to maintain or improve cancer detection rates and minimize false positives. Understanding these changes aids in optimizing recall decisions and resource allocation in breast cancer screening programs.
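
One simple check a program might run when validating a new software version is a reclassification table: how many exams move between risk categories when scored by the old and new versions. The sketch below assumes paired scores for the same exams are available. The function names and the example scores are invented for illustration and do not come from the study.

```python
from collections import Counter


def risk_category(score: int) -> str:
    """1-7 -> "low", 8-9 -> "intermediate", 10 -> "high"."""
    if not 1 <= score <= 10:
        raise ValueError(f"score out of range: {score!r}")
    if score <= 7:
        return "low"
    if score <= 9:
        return "intermediate"
    return "high"


def reclassification_table(old_scores, new_scores) -> Counter:
    """Count exams by (old category, new category).

    The two sequences must hold scores for the same exams in the same order.
    """
    if len(old_scores) != len(new_scores):
        raise ValueError("score lists must have the same length")
    return Counter(
        (risk_category(old), risk_category(new))
        for old, new in zip(old_scores, new_scores)
    )


# Hypothetical paired scores for five exams (NOT study data).
scores_old_version = [3, 8, 10, 7, 9]
scores_new_version = [5, 10, 10, 8, 6]

table = reclassification_table(scores_old_version, scores_new_version)
changed = sum(n for (old_cat, new_cat), n in table.items() if old_cat != new_cat)

for (old_cat, new_cat), n in sorted(table.items()):
    print(f"{old_cat:>12} -> {new_cat:<12} {n}")
print("exams that changed category:", changed)
```

Off-diagonal cells, where the category differs between versions, are the exams whose triage or recall handling could change after an update, so they are a natural focus for quality assurance review.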

Conclusion

This study highlights that version updates in commercial AI models for mammography can significantly impact interpretive performance and screening outcomes. Ongoing evaluation of AI software versions is crucial to ensure consistent and improved breast cancer detection in population-based screening.

References

BreastScreen Norway Program Data and Ethics Approvals
ScreenPoint Medical BV - Transpara AI Model Versions 1.7 and 2.1
St. Gallen Molecular Subtype Classification
