PAM: Efficient 3D Object Segmentation from Minimal 2D Prompts in Medical Imaging
Overview
PAM is a propagation-based framework that generates accurate 3D segmentations from minimal 2D prompts by integrating CNN and Transformer architectures. Across 44 diverse datasets it outperforms existing models such as MedSAM and SegVol, improving the average Dice similarity coefficient (DSC) by 19.3% and reducing user interaction time by 63.6%, while remaining robust to variations in prompts and propagation settings.
Background
Volumetric segmentation in medical imaging is critical for delineating anatomical structures across modalities such as CT, MRI, PET-CT, and SRX. Manual segmentation remains labor-intensive and time-consuming, which motivates automated methods that generalize across objects and modalities. Existing deep learning models often require extensive task-specific training and annotations, limiting their adaptability. Foundation models like SAM have shown promise in natural image segmentation, but applying them directly to 3D medical imaging is difficult because the segmentation must stay consistent from slice to slice and account for volumetric complexity.
Data Highlights
| Metric | PAM | MedSAM | SegVol |
|---|---|---|---|
| Average DSC Improvement | +19.3% | Baseline | Baseline |
| User Interaction Time Reduction | 63.6% | Baseline | Baseline |
| Performance Stability (Prompt Variations) | P ≥ 0.5985 | Not reported | Not reported |
| Performance Stability (Propagation Settings) | P ≥ 0.6131 | Not reported | Not reported |
| Inference Speed | Faster (P < 0.001) | Slower | Slower |
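For readers unfamiliar with the headline metric, the Dice similarity coefficient (DSC) between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula, not the evaluation code used for PAM. The function name `dice_coefficient`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (for example, a flattened 2D slice or 3D volume).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # |A ∩ B|: voxels marked foreground in both masks
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # |A| + |B|: foreground voxels in each mask, counted separately
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly
        return 1.0
    return 2.0 * intersection / total


# One overlapping voxel, three foreground voxels in total: 2 * 1 / 3
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground voxels.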
Key Findings
PAM integrates a CNN-based UNet for intra-slice feature extraction with Transformer attention for inter-slice propagation, capturing structural and semantic continuity.
Across 44 diverse datasets, PAM improved average Dice similarity coefficient by 19.3% compared to MedSAM and SegVol.
PAM demonstrated stable segmentation performance despite variations in user prompts and propagation parameters.
User interaction time was reduced by 63.6%, enhancing clinical workflow efficiency.
Inference with PAM was significantly faster than competing models (P < 0.001).
Performance gains were most pronounced for irregularly shaped objects: the improvement was negatively correlated with object regularity (r < -0.1249).
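The idea behind the first finding, attention-based propagation from a prompted slice to its neighbours, can be illustrated with a toy sketch. This is not PAM's implementation. It assumes per-pixel feature vectors are already available (in PAM these would come from the CNN), uses negative squared feature distance as the attention similarity, and introduces hypothetical names (`attention_propagate`, `propagate_volume`, `temperature`, `threshold`) purely for illustration.

```python
import math


def attention_propagate(guide_feats, guide_mask, query_feats, temperature=0.1):
    """Return a soft foreground score for every query pixel.

    Each query pixel attends over all guide pixels (softmax of negative
    squared feature distance) and takes the weighted mean of their labels.
    """
    soft_mask = []
    for q in query_feats:
        scores = [
            -sum((a - b) ** 2 for a, b in zip(q, g)) / temperature
            for g in guide_feats
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        soft_mask.append(sum(w * y for w, y in zip(weights, guide_mask)) / total)
    return soft_mask


def propagate_volume(volume_feats, prompt_index, prompt_mask, threshold=0.5):
    """Spread a 2D prompt mask through a volume, one slice at a time.

    volume_feats is a list of slices, each a list of per-pixel feature
    vectors. The prompted slice guides its neighbours, and each newly
    segmented slice then guides the next one, in both directions.
    """
    masks = [None] * len(volume_feats)
    masks[prompt_index] = list(prompt_mask)
    for step in (1, -1):  # forward pass, then backward pass
        i = prompt_index + step
        while 0 <= i < len(volume_feats):
            soft = attention_propagate(
                volume_feats[i - step], masks[i - step], volume_feats[i]
            )
            masks[i] = [1 if s >= threshold else 0 for s in soft]
            i += step
    return masks


# Toy volume: 3 slices of 4 pixels, each pixel described by one intensity feature.
volume = [
    [[0.85], [0.75], [0.15], [0.05]],
    [[0.90], [0.80], [0.10], [0.00]],  # prompted slice
    [[0.95], [0.70], [0.20], [0.10]],
]
masks = propagate_volume(volume, prompt_index=1, prompt_mask=[1, 1, 0, 0])
```

In this toy example a single annotated slice is enough to label the whole volume, because each bright pixel in a neighbouring slice attends mostly to the bright, foreground-labelled pixels of its guide slice.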
Clinical Implications
PAM offers a practical solution for generating accurate 3D segmentations from minimal user input, reducing the burden of manual annotation and task-specific retraining. Its robustness across diverse imaging modalities and object types supports broader clinical applicability, potentially accelerating diagnosis, treatment planning, and monitoring. The reduced interaction time and faster inference facilitate integration into routine clinical workflows.
Conclusion
By combining intra-slice CNN features with inter-slice Transformer propagation, PAM effectively addresses volumetric segmentation challenges in medical imaging. It provides a generalizable, efficient, and user-friendly tool that outperforms existing methods, advancing automated 3D segmentation in clinical practice.
References
Kirillov et al. 2023 -- Segment Anything Model (SAM)