SepsisDRM: An Integrated Embedding Model for Comprehensive Sepsis Data Representation
Overview
SepsisDRM is an embedding model that integrates tabular and clinical text data into a single representation of sepsis patients. Trained on 19,526 patients, it stratifies patients into four phenotypes and predicts 28-day outcomes with AUCs of 0.92 (retrospective test), 0.94 (prospective test), and 0.78 (external validation).
Background
Sepsis remains a critical clinical challenge with high morbidity and mortality worldwide. Traditional sepsis models often rely solely on tabular data and are limited by small labeled datasets and task-specific designs. Clinical text contains valuable information that is frequently underutilized in sepsis research. Integrating multimodal data sources could enhance patient stratification and outcome prediction, potentially improving clinical decision-making.
Data Highlights
Dataset | Number of Patients | 28-day Outcome Prediction AUC
Training Dataset | 19,526 | Not applicable
Retrospective Test | Not specified | 0.92
Prospective Test | Not specified | 0.94
External Validation (SYSMH) | Not specified | 0.78
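For readers less familiar with the metric, the AUC reported for each test set can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes it that way on toy data. The function name `roc_auc`, the labels, and the scores are illustrative assumptions and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = outcome occurred within 28 days.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy case 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the number of patients; it is written this way for clarity rather than speed.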
Key Findings
SepsisDRM jointly processes tabular and textual clinical data to create comprehensive patient embeddings.
It was trained on a cohort of 19,526 sepsis patients, which is large relative to the small labeled datasets typical of task-specific sepsis models.
The model stratifies patients into four clinically interpretable sepsis phenotypes.
It achieved AUCs of 0.92 (retrospective test), 0.94 (prospective test), and 0.78 (external validation, SYSMH) for 28-day outcome prediction.
It generalized across diverse sepsis-related tasks without requiring task-specific tuning.
It is presented as the first embedding model developed specifically for sepsis that integrates multimodal data.
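The general idea of combining tabular and text representations and then grouping patients into four phenotypes can be sketched as follows. This is a minimal illustration under stated assumptions, not the SepsisDRM method: the summary does not say how the two modalities are fused or how phenotypes are derived, so the sketch simply concatenates standardized tabular features with a text vector and applies plain k-means with k = 4. All names (`zscore_columns`, `fuse`, `kmeans`) and all data are hypothetical.

```python
import math
import random


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # guard constant columns
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fuse(tabular, text_vecs):
    """Assumed fusion step: concatenate standardized tabular features and text vectors."""
    return [t + x for t, x in zip(zscore_columns(tabular), text_vecs)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain k-means (Lloyd's algorithm), initialized from k randomly chosen points."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_assign == assign:  # converged: assignments stopped changing
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers


# Synthetic data: 4 groups of 5 "patients", two tabular features and a
# 2-dimensional stand-in for a text encoder's output (all hypothetical).
rng = random.Random(42)
tabular = [[rng.gauss(80 + 10 * g, 2), rng.gauss(1 + g, 0.2)] for g in range(4) for _ in range(5)]
text_vecs = [[rng.gauss(g, 0.1), rng.gauss(-g, 0.1)] for g in range(4) for _ in range(5)]

fused = fuse(tabular, text_vecs)
assign, centers = kmeans(fused, k=4)
print(assign)
```

Concatenation followed by k-means is only the simplest baseline for this kind of pipeline. A learned joint embedding, as the summary describes for SepsisDRM, would replace the `fuse` step with a trained model.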
Clinical Implications
SepsisDRM offers a promising tool for patient stratification and outcome prediction that draws on both structured and unstructured clinical data. Its AUCs of 0.92 and 0.94 on the retrospective and prospective test sets, and 0.78 on external validation, suggest potential utility in diverse clinical settings to guide personalized management strategies. The model's open-source availability facilitates reproducibility and integration into future research.
Conclusion
SepsisDRM integrates multimodal data into a unified embedding framework for sepsis, supporting phenotype stratification and 28-day outcome prediction. This approach may inform better clinical decision-making and inspire similar models for other complex diseases.