Reply to: Clarifying multimodal inputs and attribution in colorectal cancer survival prediction - Takeaways - MDSpire

By Yiqing Jiang, Yue Tian, and Chenshen Huang

May 20, 2026

  1. The survival prediction results were obtained using CT-only inference in the HydraMamba framework.

  2. Cross-Modal Representation Learning trains a shared backbone using CT and endoscopic images for improved anatomical representation.

  3. The ViT pooling (CT+Endo) baseline refers to a training dataset composition, not additional test-time inputs.

  4. The survival head in HydraMamba receives a CT-derived representation from late fusion of anatomy tokens and style vectors.

  5. Endoscopic data aids in training by providing textural priors, enhancing the CT features used for survival prediction.
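
The split between training-time and test-time inputs described in these takeaways can be illustrated with a minimal Python sketch. It is not the HydraMamba implementation. The function names, the mean pooling of anatomy tokens, the fusion by concatenation, and the linear survival head are all assumptions made for illustration. The only points taken from the takeaways are that the training set pools CT and endoscopic images, that the survival head sees a CT-derived representation built by late fusion of anatomy tokens and a style vector, and that inference takes CT input only.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def pooled_training_set(ct_images: Sequence, endo_images: Sequence) -> List[Tuple[str, object]]:
    """Training-time composition (hypothetical helper): CT and endoscopic
    images are pooled so a shared backbone can learn from both modalities."""
    return [("ct", x) for x in ct_images] + [("endo", x) for x in endo_images]


def mean_pool(tokens: Sequence[Sequence[float]]) -> Vector:
    """Average a sequence of equal-length token vectors (assumed pooling)."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def late_fusion(anatomy_tokens: Sequence[Sequence[float]], style_vector: Sequence[float]) -> Vector:
    """Fuse CT-derived anatomy tokens with a CT-derived style vector.
    Concatenation after pooling is an assumption, not the paper's operator."""
    return mean_pool(anatomy_tokens) + list(style_vector)


def survival_head(fused: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Toy linear risk score standing in for the real survival head."""
    return sum(w * x for w, x in zip(weights, fused)) + bias


def predict_risk(ct_anatomy_tokens: Sequence[Sequence[float]],
                 ct_style_vector: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Test-time path: the signature accepts CT-derived features only.
    No endoscopic input appears anywhere in this function."""
    return survival_head(late_fusion(ct_anatomy_tokens, ct_style_vector), weights)


if __name__ == "__main__":
    # Training pools both modalities.
    print([modality for modality, _ in pooled_training_set(["ct_a"], ["endo_a", "endo_b"])])
    # Inference uses CT-derived features only.
    print(predict_risk([[1.0, 2.0], [3.0, 4.0]], [0.5], [0.5, 0.25, 2.0]))  # prints 2.75
```

In this sketch the endoscopic images influence only what the shared backbone learns during training. The prediction path has no endoscopic argument, which mirrors the point that "CT+Endo" describes the training data rather than an extra test-time input.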
