Random features meet MIL: a deep GP approach to colorectal MSI prediction

By
Shixuan Shen
Zeyang Wang
Tianmu Liu
Kangle Ma
Zhen Tian
Fuqiang Zhang
Qingyue Zhang
December 15, 2025

npj Digital Medicine

At a Glance

Category	Detail
Condition	Colorectal cancer (CRC)
Key Mechanisms	Integration of deep Gaussian processes with multi-instance learning and attention-based aggregation to handle weak supervision and improve classification from whole-slide histopathological images
Target Population	Patients undergoing colorectal cancer diagnosis via histopathological imaging
Care Setting	Clinical diagnostic imaging and pathology laboratories

Key Highlights

The proposed model (DGP-RF) achieves an AUC of 0.895 on the TCGA-CRC dataset, outperforming ResNet, EfficientNet, and ShuffleNet
The model handles weakly labeled data through multi-instance learning with bag-level labels, avoiding the need for costly instance-level annotations
Attention-based aggregation enhances interpretability by focusing on key regions within whole-slide images, supporting clinical decision-making

Guideline-Based Recommendations

Diagnosis

Utilize deep learning models that incorporate multi-instance learning to manage weak supervision in histopathological image analysis
Apply attention mechanisms to highlight diagnostically relevant regions in whole-slide images for improved interpretability
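
The two recommendations above can be illustrated with a small sketch of attention-based pooling in multi-instance learning. Each slide is treated as a bag of tile embeddings with a single bag-level label, and attention weights indicate which tiles contribute most to the bag representation. This is a simplified, generic form of attention pooling, not the scoring network used in the paper: the function name attention_pool, the parameter vector w, and the toy tile values are all hypothetical, and in practice w would be learned rather than fixed.

```python
import math

def attention_pool(instances, w):
    """Aggregate tile embeddings into one bag embedding.

    instances: list of equal-length feature vectors, one per tile.
    w: hypothetical attention parameter vector (learned in practice).
    """
    # Score each tile with a tanh-squashed linear projection.
    scores = [math.tanh(sum(wi * xi for wi, xi in zip(w, x))) for x in instances]
    # Softmax over tiles, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag embedding: attention-weighted sum of the tile embeddings.
    dim = len(instances[0])
    bag = [sum(a * x[d] for a, x in zip(weights, instances)) for d in range(dim)]
    return bag, weights

# Toy bag of three 2-dimensional tile embeddings (made-up values).
tiles = [[0.1, 0.0], [2.0, 1.5], [0.2, 0.1]]
bag_embedding, attn = attention_pool(tiles, w=[1.0, 1.0])
```

The weights in attn sum to one, so they can be mapped back onto tile positions as a heatmap; the second tile receives the largest weight here because it has the highest score.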

Management

Incorporate scalable and robust machine learning frameworks like deep Gaussian processes with random feature expansion for colorectal cancer classification
Leverage models that can handle large-scale datasets efficiently without requiring extensive manual annotation
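
Random feature expansion is what makes Gaussian process layers scale to large datasets: a kernel is replaced by an explicit, finite feature map, so the model works with feature vectors instead of a full kernel matrix. The sketch below shows the standard random Fourier feature approximation of an RBF kernel. It assumes an RBF kernel with a single lengthscale, which may differ from the kernel used in DGP-RF; the names make_rff and rbf and the input values are hypothetical.

```python
import math
import random

def make_rff(dim, n_features, lengthscale, seed=0):
    """Build a random Fourier feature map for an RBF kernel (assumed kernel)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, 1/lengthscale^2).
    omegas = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
              for _ in range(n_features)]
    # Random phases drawn uniformly from [0, 2*pi].
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(o * xi for o, xi in zip(omega, x)) + b)
                for omega, b in zip(omegas, phases)]

    return phi

def rbf(x, y, lengthscale):
    """Exact RBF kernel, used only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

phi = make_rff(dim=3, n_features=2000, lengthscale=1.0)
x, y = [0.2, -0.1, 0.4], [0.0, 0.3, 0.1]
# The inner product of the feature maps approximates the kernel value.
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
exact = rbf(x, y, 1.0)
```

The approximation error shrinks as n_features grows, which is the trade-off between accuracy and compute that a deployment would need to tune.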

Monitoring & Follow-up

Monitor model performance using metrics such as area under the receiver operating characteristic curve (AUC) to ensure diagnostic accuracy and robustness
Evaluate model interpretability to facilitate clinical acceptance and ongoing validation
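
For the AUC monitoring mentioned above, the metric can be computed directly from bag-level labels and model scores. The sketch below assumes the ROC AUC, computed as the fraction of positive/negative pairs that the model ranks correctly, with ties counted as half. The function name roc_auc and the labels and scores are made up for illustration and are not results from the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case scored above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical slide-level labels (1 = positive) and predicted scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)  # 5 of the 6 pairs are ranked correctly
```

This pairwise form is quadratic in the number of cases, which is fine for periodic checks on a validation set; a rank-based implementation would be preferable for very large cohorts.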

Risks

Be aware of potential limitations related to data heterogeneity and weak supervision in training datasets
Consider computational complexity and resource requirements when deploying attention-based deep learning models in clinical settings

Patient & Prescribing Data

Patients with colorectal cancer undergoing diagnostic evaluation via histopathological imaging

Automated and interpretable classification models can support early and accurate diagnosis, potentially improving patient outcomes by guiding timely treatment decisions

Clinical Best Practices

Employ weakly supervised learning approaches to maximize use of available labeled data while minimizing annotation burden
Integrate attention-based mechanisms to improve model transparency and clinician trust
Validate models on large, heterogeneous datasets to ensure generalizability and robustness
Combine advanced feature representation techniques such as random feature expansion with probabilistic models like deep Gaussian processes for enhanced classification performance

References

TCGA-CRC dataset

Original Source(s)

Random features meet MIL: a deep GP approach to colorectal MSI prediction
