Cross-attention guided multi-modal network for breast ultrasound diagnosis incorporating objective clinical semantics - Report - MDSpire

Cross-attention guided multi-modal network for breast ultrasound diagnosis incorporating objective clinical semantics

By Qin Sun, Xiaoman Wu, Ju Chen, Yuhang Zhang, Chao Zhou, Zheng Zhu

June 9, 2026

Clinical Report: Multi-Modal Network Utilizing Cross-Attention for Breast Ultrasound

Overview

The Cross-Attention Guided Network (CGA-Net) integrates visual data with structured clinical descriptors to improve diagnostic accuracy in breast ultrasound. Trained from scratch, it achieved an Out-Of-Fold AUC of 0.905, above the clinical-only (0.890) and image-only (0.795) baselines, with a specificity of 0.831. The report interprets these results as suggesting a reduction in false positives.
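For readers unfamiliar with the metric, an out-of-fold AUC pools the predictions that each cross-validation fold makes on its held-out cases and computes a single AUC over the pooled scores. The sketch below illustrates that bookkeeping in plain Python. The fold count, the one-feature stand-in model (`fit_mean`, `predict_mean`) and the synthetic data are all hypothetical and are not taken from the report.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def out_of_fold_scores(features, labels, fit, predict, k=5, seed=0):
    """Return one score per sample, each from a model that never saw that sample."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint held-out sets
    oof = [None] * len(labels)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            oof[i] = predict(model, features[i])
    return oof

# Hypothetical stand-in model: score is the feature's offset from the benign mean.
def fit_mean(xs, ys):
    benign = [x for x, y in zip(xs, ys) if y == 0]
    return sum(benign) / len(benign)

def predict_mean(model, x):
    return x - model

# Synthetic data (assumption): malignant cases have a higher feature value on average.
rng = random.Random(42)
labels = [i % 2 for i in range(100)]
features = [rng.gauss(1.5 * y, 1.0) for y in labels]

oof = out_of_fold_scores(features, labels, fit_mean, predict_mean)
print(f"out-of-fold AUC: {auc(labels, oof):.3f}")
```

Because every score comes from a model that did not train on that case, the pooled AUC is less optimistic than one computed on training data.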

Background

Breast cancer is a leading cause of cancer-related mortality among women, making accurate diagnosis crucial for improving survival rates. Breast ultrasound is a primary screening tool, yet its effectiveness is often limited by operator subjectivity and inter-observer variability. The integration of AI and structured clinical information may enhance diagnostic accuracy and reduce unnecessary procedures.

Data Highlights

Model                            Out-Of-Fold AUC   Overall Accuracy   Specificity   Sensitivity
CGA-Net (trained from scratch)   0.905             0.857              0.831         0.898
CGA-Net (pre-trained)            0.915             N/A                N/A           N/A
Clinical-only baseline           0.890             N/A                N/A           N/A
Image-only baseline              0.795             N/A                N/A           N/A
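The accuracy, specificity and sensitivity figures are linked through the class mix: accuracy equals sensitivity times p plus specificity times (1 - p), where p is the fraction of malignant cases. As a consistency check, the sketch below solves that identity for p using the CGA-Net (trained from scratch) row. It assumes all three metrics were measured at the same decision threshold on the same cases. The resulting fraction is an inference from rounded figures, not a value stated in the report, and the function name is illustrative.

```python
def implied_positive_fraction(accuracy, sensitivity, specificity):
    """Solve accuracy = sens * p + spec * (1 - p) for p, the positive-class fraction."""
    if sensitivity == specificity:
        raise ValueError("p cannot be identified when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)

# Reported figures for CGA-Net trained from scratch.
p = implied_positive_fraction(accuracy=0.857, sensitivity=0.898, specificity=0.831)
print(f"implied malignant fraction: {p:.2f}")  # 0.39

# Specificity of 0.831 corresponds to this false-positive rate among benign cases.
false_positive_rate = 1 - 0.831
print(f"false-positive rate among benign cases: {false_positive_rate:.3f}")  # 0.169
```

Under these assumptions, roughly 17 in 100 benign lesions would still be flagged as suspicious.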

Key Findings

  • The CGA-Net model integrates visual data with clinical descriptors to enhance diagnostic accuracy.
  • It achieved an Out-Of-Fold AUC of 0.905 and an overall accuracy of 0.857.
  • Specificity was recorded at 0.831, while sensitivity reached 0.898.
  • A pre-trained version of CGA-Net reached an Out-Of-Fold AUC of 0.915, above the clinical-only (0.890) and image-only (0.795) baselines.
  • Attention map visualizations indicated alignment with expert focus on tumor periphery.
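To make the fusion idea concrete, the following is a minimal sketch of scaled dot-product cross-attention in plain Python, in which image patch features act as queries and clinical descriptor embeddings act as keys and values. This direction of attention is an assumption, as are the toy embeddings. The sketch omits the learned projections and multiple heads a real network would use, and it should not be read as the CGA-Net architecture itself.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each query mixes the values, weighted by its scaled dot product with each key."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights

# Hypothetical embeddings: 2 image patches (queries) and 3 clinical descriptors.
image_patches = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
clinical_tokens = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

fused, attn = cross_attention(image_patches, clinical_tokens, clinical_tokens)
for i, w in enumerate(attn):
    print(f"patch {i} attention over descriptors:", [round(x, 3) for x in w])
```

Each row of `attn` sums to 1, so it can be read as how strongly a given image patch draws on each clinical descriptor. Weights of this kind are what attention map visualizations typically display.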

Clinical Implications

CGA-Net may give clinicians a tool that reduces diagnostic variability in breast ultrasound. Its ability to integrate structured clinical information may support decision-making and, in turn, improve patient outcomes.

Conclusion

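CGA-Net combines visual and clinical data for breast ultrasound diagnosis, reaching an Out-Of-Fold AUC of 0.905 (0.915 with pre-training) against 0.890 for the clinical-only baseline and 0.795 for the image-only baseline. The reported results suggest it may also reduce false positives.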
