Hierarchical Mamba-CNN Transducer for Enhanced Liver Tumor Segmentation in CT Imaging
Overview
The hierarchical Mamba-CNN transducer (HMC-transducer) is a hybrid architecture that combines CNNs with the Mamba state space model to improve liver tumor segmentation from CT scans. Across multiple public datasets, it is reported to achieve state-of-the-art accuracy, better generalization, and greater computational efficiency than CNN and transformer baselines.
Background
Accurate liver tumor segmentation from CT imaging is essential for diagnosis and treatment planning, but tumor heterogeneity and unclear boundaries make it difficult. Traditional CNNs capture local features well but lack long-range spatial modeling, while transformers provide global context at a computational cost that grows quadratically with the number of tokens. The HMC-transducer addresses these limitations by integrating CNNs with a direction-aware 3D Mamba block, enabling efficient volumetric data processing and adaptive fusion of local and global features.
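As a rough illustration of how a direction-aware scan over a volume might work, the Python sketch below runs a simple linear state-space recurrence along each of the three axes of a 3D array and averages the results. It is not the authors' implementation: the function names `ssm_scan` and `directional_scan_3d`, the fixed scalar parameters `a`, `b`, and `c`, and the averaging step are all assumptions made for illustration, and an actual Mamba block uses learned, input-dependent parameters.

```python
import numpy as np

def ssm_scan(x, a=0.9, b=0.1, c=1.0):
    """Apply h[t] = a*h[t-1] + b*x[t], y[t] = c*h[t] along the last axis.

    The sequence is visited once, so the cost grows linearly with its length.
    The scalars a, b, c are illustrative constants, not learned parameters.
    """
    h = np.zeros(x.shape[:-1], dtype=float)   # hidden state starts at zero
    y = np.empty(x.shape, dtype=float)
    for t in range(x.shape[-1]):
        h = a * h + b * x[..., t]             # state update
        y[..., t] = c * h                     # readout
    return y

def directional_scan_3d(volume):
    """Scan a (D, H, W) volume along each axis in turn and average the outputs."""
    outputs = []
    for axis in range(3):
        moved = np.moveaxis(volume, axis, -1)            # bring the scan axis last
        scanned = ssm_scan(moved)
        outputs.append(np.moveaxis(scanned, -1, axis))   # restore (D, H, W) layout
    return np.mean(outputs, axis=0)

volume = np.random.default_rng(0).normal(size=(4, 5, 6))
features = directional_scan_3d(volume)
print(features.shape)  # (4, 5, 6)
```

Because each axis is scanned in its native order, neighboring voxels stay adjacent in the sequence being scanned, which is one plausible reading of what preserving spatial topology means here.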
Data Highlights
Dataset | Model | Performance | Computational Efficiency
LiTS17 | HMC-transducer | New state-of-the-art segmentation accuracy | Superior to CNN and transformer baselines
MSD-liver | HMC-transducer | Improved generalization | Efficient linear-complexity modeling
KiTS21 | HMC-transducer | High segmentation accuracy | Computationally efficient for 3D volumes
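To give a sense of why linear-complexity modeling matters for 3D volumes, the following back-of-the-envelope Python sketch compares the number of pairwise token interactions in full self-attention with the number of steps in a three-direction sequential scan. The volume size (128 x 128 x 128), the patch size of 4, and the helper names are hypothetical choices for illustration; constant factors such as channel width and state size are ignored, so the ratio indicates scaling only and is not a measured speedup.

```python
def token_count(depth, height, width, patch=4):
    """Number of tokens when the volume is split into patch^3 blocks."""
    return (depth // patch) * (height // patch) * (width // patch)

def attention_interactions(n_tokens):
    """Full self-attention compares every token with every token: O(N^2)."""
    return n_tokens * n_tokens

def scan_steps(n_tokens, directions=3):
    """A sequential scan visits each token once per direction: O(N)."""
    return directions * n_tokens

n_tokens = token_count(128, 128, 128)          # hypothetical CT crop
quadratic = attention_interactions(n_tokens)
linear = scan_steps(n_tokens)

print(n_tokens)    # 32768
print(quadratic)   # 1073741824
print(linear)      # 98304
```

Doubling the volume along every axis multiplies the token count by 8, which multiplies the scan steps by 8 and the attention interactions by 64.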
Key Findings
The HMC-transducer combines CNNs with the linear-complexity Mamba state space model for effective local and global feature integration.
The direction-aware 3D Mamba block preserves spatial topology along all three axes, enhancing volumetric data processing.
A gated fusion mechanism adaptively weighs local and global features at each network hierarchy level.
Evaluation on the LiTS17, MSD-liver, and KiTS21 datasets shows higher segmentation accuracy than existing CNN and transformer models.
The model achieves better generalization and computational efficiency, addressing the quadratic cost limitations of transformers in 3D imaging.
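The gated fusion idea in the findings above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the published design: it assumes the gate is a sigmoid over a channel-mixing (1x1x1 convolution style) projection of the concatenated local and global features, and that the output is a convex combination of the two. The names `gated_fusion`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(local_feat, global_feat, w_gate, b_gate):
    """Blend two (C, D, H, W) feature maps with a per-channel, per-voxel gate.

    w_gate has shape (C, 2C) and b_gate has shape (C,). Both would be learned
    in a real network; here they are supplied by the caller.
    """
    stacked = np.concatenate([local_feat, global_feat], axis=0)   # (2C, D, H, W)
    # Channel mixing applied independently at every voxel (a 1x1x1 convolution).
    logits = np.einsum("oc,cdhw->odhw", w_gate, stacked)
    logits = logits + b_gate[:, None, None, None]
    gate = sigmoid(logits)                                        # values in (0, 1)
    return gate * local_feat + (1.0 - gate) * global_feat

rng = np.random.default_rng(0)
channels = 8
local_feat = rng.normal(size=(channels, 4, 4, 4))    # stand-in for CNN features
global_feat = rng.normal(size=(channels, 4, 4, 4))   # stand-in for Mamba features
w_gate = rng.normal(scale=0.1, size=(channels, 2 * channels))
b_gate = np.zeros(channels)

fused = gated_fusion(local_feat, global_feat, w_gate, b_gate)
print(fused.shape)  # (8, 4, 4, 4)
```

In a hierarchical network, a separate gate of this kind would presumably be instantiated at each resolution level, so that the balance between local and global features can differ between shallow and deep stages.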
Clinical Implications
The HMC-transducer offers a practical and generalizable solution for liver tumor segmentation in clinical CT imaging, potentially improving diagnostic accuracy and treatment planning. Its computational efficiency facilitates deployment in real-world clinical workflows, enabling faster and more reliable tumor delineation.
Conclusion
The hierarchical Mamba-CNN transducer advances liver tumor segmentation by integrating local and global features while remaining computationally efficient. This approach sets a new benchmark for accuracy and generalization in volumetric medical image analysis.