Performance across different versions of an artificial intelligence model for screen-reading of mammograms - Scorecard - MDSpire

By Marthe Larsen, Christoph I. Lee, Marie B. Bergan, Åsne S. Holen, Håkon Lund-Hanssen, Solveig R. Hoff, Steinar Auensen, Jan F. Nygård, Kristina Lång, Yan Chen, Giske Ursin, and Solveig Hofvind

January 13, 2026

Clinical Scorecard: Performance Across Versions of an Artificial Intelligence Model for Screen-Reading of Mammograms

At a Glance

Category | Detail
Condition | Breast cancer screening via mammography
Key Mechanisms | Artificial intelligence models analyze mammogram images to assign malignancy risk scores, aiding radiologists in interpretation and decision-making
Target Population | Women aged 50–69 undergoing biennial mammography screening in BreastScreen Norway
Care Setting | National breast cancer screening program with radiologist double reading and AI-assisted interpretation

Key Highlights

  • AI models like Transpara versions 1.7 and 2.1 are FDA-cleared and CE-marked for mammography interpretation, providing malignancy risk scores to support radiologists.
  • Version updates in AI models can improve sensitivity and impact screening outcomes by altering thresholds for suspicious findings.
  • A retrospective cohort study of 117,709 mammograms from BreastScreen Norway compared AI malignancy risk scores and screening performance metrics between the two AI model versions.
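
One way to picture a version-to-version comparison is to score the same exams with both versions and count how many move between risk categories. The Python sketch below is illustrative only: the function names and the toy scores are assumptions, not the study's method or data, and the category cut-offs are the ones this scorecard lists (1–7 low, 8–9 intermediate, 10 high).

```python
from collections import Counter


def category(score: int) -> str:
    """Map a 1-10 risk score to the scorecard's categories (assumed cut-offs)."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 7:
        return "low"
    if score <= 9:
        return "intermediate"
    return "high"


def category_shift(scores_old: list[int], scores_new: list[int]) -> Counter:
    """Count (old category, new category) pairs for exams scored by both versions."""
    if len(scores_old) != len(scores_new):
        raise ValueError("both versions must score the same exams")
    return Counter(
        (category(old), category(new))
        for old, new in zip(scores_old, scores_new)
    )


# Hypothetical scores for four exams, not taken from the study.
old_version = [3, 8, 10, 7]
new_version = [3, 9, 9, 8]
shifts = category_shift(old_version, new_version)
```

A table like `shifts` shows at a glance whether an update moves exams across the recall-relevant boundary between low and intermediate risk.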

Guideline-Based Recommendations

Diagnosis

  • Use AI models as decision support during radiologist interpretation to improve detection rates and reduce false positives.
  • Apply AI malignancy risk scores to triage mammograms for single or double reading based on risk level.

Management

  • Recall patients with AI scores indicating intermediate to high malignancy risk (scores 8–10) for further assessment.
  • Incorporate AI updates cautiously, monitoring changes in interpretive thresholds and screening outcomes.

Monitoring & Follow-up

  • Regularly validate and quality assure AI model performance, especially after software version updates.
  • Monitor screen-detected and interval cancer rates to assess AI impact on screening program effectiveness.
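
The monitoring metrics named above can be expressed with two simple formulas: a cancer rate per 1,000 screening exams, and program sensitivity as screen-detected cancers divided by screen-detected plus interval cancers. The sketch below assumes these common screening definitions. The class and function names are hypothetical and the counts are made-up illustration values, not results from BreastScreen Norway.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Counts for one monitoring period (hypothetical structure)."""
    exams: int
    screen_detected: int
    interval: int


def rate_per_1000(cases: int, exams: int) -> float:
    """Cancer rate per 1,000 screening exams."""
    if exams <= 0:
        raise ValueError("exams must be positive")
    return 1000 * cases / exams


def program_sensitivity(counts: ScreeningCounts) -> float:
    """Screen-detected cancers as a share of screen-detected plus interval cancers."""
    total = counts.screen_detected + counts.interval
    if total == 0:
        raise ValueError("no cancers recorded in this period")
    return counts.screen_detected / total


# Made-up counts for illustration only.
period = ScreeningCounts(exams=10_000, screen_detected=60, interval=20)
sdc_rate = rate_per_1000(period.screen_detected, period.exams)  # 6.0 per 1,000
ic_rate = rate_per_1000(period.interval, period.exams)          # 2.0 per 1,000
sensitivity = program_sensitivity(period)                       # 0.75
```

Computing the same figures before and after a software update gives a like-for-like view of whether the new version changes program-level outcomes.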

Risks

  • Be aware of ethical and legal challenges related to AI use, including automation bias and cost implications.
  • Consider potential changes in AI thresholds over time that may affect population-level screening results.

Patient & Prescribing Data

Women aged 50–69 participating in biennial mammography screening in BreastScreen Norway

AI malignancy risk scores categorize exams into low (1–7), intermediate (8–9), and high (10) risk, guiding recall decisions and potentially improving cancer detection sensitivity.
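
The score bands above amount to a small lookup rule. A minimal Python sketch, assuming integer scores from 1 to 10 and using a hypothetical function name:

```python
def risk_category(score: int) -> str:
    """Return the risk band for a 1-10 AI malignancy risk score.

    Bands follow this scorecard: 1-7 low, 8-9 intermediate, 10 high.
    """
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 7:
        return "low"
    if score <= 9:
        return "intermediate"
    return "high"


# Group a few example scores by band (example values, not study data).
examples = {score: risk_category(score) for score in (1, 7, 8, 9, 10)}
```

Rejecting out-of-range values explicitly keeps a malformed score from being silently treated as low risk.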

Clinical Best Practices

  • Use AI as an adjunct to radiologist double reading to enhance cancer detection and reduce workload.
  • Interpret AI malignancy risk scores in conjunction with radiologist assessments and consensus meetings.
  • Continuously monitor AI model performance and update protocols following software version changes.
  • Ensure ethical and legal compliance when implementing AI tools in screening programs.
  • Maintain rigorous validation and quality assurance processes for AI systems in clinical practice.
