Performance and clinical applicability of machine learning in liver computed tomography imaging: a systematic review

Category	Detail
Condition	Liver diseases requiring CT imaging for diagnosis and follow-up
Key Mechanisms	Application of machine learning (ML), especially deep learning (DL), to CT images for liver and lesion segmentation, detection, classification, and tissue characterization
Target Population	Patients undergoing liver CT imaging for diagnosis, treatment evaluation, or disease prediction
Care Setting	Radiology and hepatology clinical settings utilizing CT imaging

ML-based tools have shown promising results in liver segmentation, lesion detection, and classification on CT images.
Eighty-four studies addressed liver segmentation, reporting DICE scores from 0.75 to 0.9851 (where 1.0 denotes perfect overlap with the reference mask), which indicates generally high model performance.
Lesion segmentation models perform well for lesions >2 cm but struggle with lesions <1 cm, paralleling clinical challenges.
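To make the DICE score (the Dice similarity coefficient) concrete, the following is a minimal sketch of how it is commonly computed for two binary masks: twice the overlap divided by the sum of the two mask sizes. The function name `dice_score`, the toy masks, and the choice to return 1.0 when both masks are empty are assumptions made for illustration, not details taken from the reviewed studies.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / total

# Toy 2D example (illustrative data only).
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True   # reference mask: 4 pixels
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True    # predicted mask: 6 pixels, 4 of them overlap

print(dice_score(pred, truth))  # 2*4 / (6+4) = 0.8
```

The same function applies unchanged to 3D CT volumes, since it only counts voxels in each mask.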

Use ML models as adjunct tools for liver and lesion segmentation on CT images to support radiological assessment.
Prefer models validated with external datasets and compared to human expert performance for clinical reliability.

Incorporate ML-based segmentation and classification to assist in treatment planning and post-treatment follow-up.
Use publicly available datasets such as LiTS 2017 for model training and validation to support generalizability.

Monitor ML model performance using metrics such as DICE score, AUC-ROC, and accuracy with confidence intervals when available.
Regularly validate ML tools against clinical standards and human expert assessments.
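One common way to attach a confidence interval to a performance metric is the percentile bootstrap. The sketch below resamples cases with replacement and reports the 2.5th and 97.5th percentiles of the recomputed metric. It uses accuracy for simplicity; the helper names `bootstrap_ci` and `accuracy`, the number of resamples, and the toy labels are all assumptions for illustration and do not come from the review.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of cases where the prediction matches the reference label."""
    return float(np.mean(y_true == y_pred))

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate plus percentile-bootstrap confidence interval."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        stats[b] = metric(y_true[idx], y_pred[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return metric(y_true, y_pred), float(lower), float(upper)

# Toy labels (illustrative only): 20 cases, 16 classified correctly.
y_true = np.array([1] * 10 + [0] * 10)
y_pred = np.array([1] * 8 + [0] * 2 + [0] * 8 + [1] * 2)

point, lower, upper = bootstrap_ci(y_true, y_pred, accuracy)
print(f"accuracy = {point:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Any metric with the same `(y_true, y_pred)` signature, for example a per-case DICE score averaged over cases, could be passed in place of `accuracy`.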

Be aware of limitations in segmenting small lesions (<1 cm), which may affect diagnostic sensitivity.
Consider the risk of bias in studies with limited reporting transparency and lack of external validation.
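The small-lesion limitation noted above can be checked locally by reporting segmentation performance separately per lesion size group rather than as a single average. The sketch below is one possible way to do this, under stated assumptions: lesion size is approximated by the diameter of a sphere with the same volume as the segmented lesion (clinical practice often uses the longest axial diameter instead), the size bins mirror the <1 cm and >2 cm thresholds mentioned in this summary, and the per-lesion values are invented purely for illustration.

```python
import numpy as np

def equivalent_diameter_cm(voxel_count, spacing_mm):
    """Diameter (cm) of a sphere with the same volume as the lesion."""
    volume_mm3 = voxel_count * float(np.prod(spacing_mm))
    diameter_mm = (6.0 * volume_mm3 / np.pi) ** (1.0 / 3.0)
    return diameter_mm / 10.0

def size_bin(diameter_cm):
    """Assign a lesion to one of three size groups."""
    if diameter_cm < 1.0:
        return "<1 cm"
    if diameter_cm <= 2.0:
        return "1-2 cm"
    return ">2 cm"

def mean_dice_by_size(lesions, spacing_mm):
    """lesions: iterable of (voxel_count, dice) pairs, one per lesion."""
    groups = {}
    for voxel_count, dice in lesions:
        key = size_bin(equivalent_diameter_cm(voxel_count, spacing_mm))
        groups.setdefault(key, []).append(dice)
    return {key: float(np.mean(values)) for key, values in groups.items()}

# Hypothetical per-lesion results on 1 mm isotropic voxels (not real data).
lesions = [(200, 0.55), (300, 0.65), (2000, 0.80), (10000, 0.90)]
summary = mean_dice_by_size(lesions, spacing_mm=(1.0, 1.0, 1.0))
for group, value in summary.items():
    print(group, round(value, 2))
```

A gap between the smallest group and the others would flag cases where expert review deserves extra weight.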

Patients undergoing liver CT imaging for various clinical indications, including lesion detection and liver disease evaluation.

ML tools can enhance diagnostic accuracy and efficiency but should complement, not replace, expert radiological interpretation.

Adopt ML models with demonstrated external validation and comparison to human experts for clinical use.
Use standardized performance metrics such as DICE score to evaluate segmentation accuracy.
Recognize current ML limitations in detecting small lesions and integrate clinical judgment accordingly.
Encourage transparency in reporting ML model performance including confidence intervals or standard errors.
Leverage publicly available datasets to improve model training and reproducibility.

Clinical Scorecard: Evaluation of Machine Learning Efficacy and Clinical Relevance in Liver CT Imaging: A Systematic Review