Concordance among experts in assessing apical mucosal preservation during holmium laser enucleation of the prostate (HoLEP): implications for artificial intelligence model development

By
Archan Khandekar
Aravindh Rathinam
Ansh Bhatia
Diana M. Lopategui
Jonathan Katz
Roger L. Sur
Nicholas Smith
Pankaj N. Maheshwari
Hemendra N. Shah
December 4, 2025

World Journal of Urology

Overview

This study assessed interrater reliability among six expert urologists rating apical mucosal preservation in HoLEP surgical videos and explored associations with postoperative urinary continence. Moderate agreement was observed, supporting the feasibility of using expert-labeled data to train AI models for intraoperative assessment.

Background

Holmium laser enucleation of the prostate (HoLEP) is a standard surgical treatment for benign prostatic hyperplasia, but transient stress urinary incontinence (TSUI) remains a common early postoperative complication. Technical refinements such as improved apical mucosal preservation have reduced TSUI rates. Artificial intelligence (AI) could automate intraoperative assessment, but models can only be trained accurately on reliable expert annotations. This study evaluates expert concordance in classifying apical mucosal preservation from surgical videos as a foundation for AI development.

Data Highlights

Metric	Value
Number of videos reviewed	60
Number of expert raters	6
Rating scale	3-tier (completely preserved, partially preserved, not preserved)
Postoperative continence follow-up	6 weeks
Interrater agreement measure	Fleiss’ multi-rater Kappa (κ)

Key Findings

Six expert urologists independently rated apical mucosal preservation on a 3-level scale from anonymized HoLEP videos.
Interrater reliability showed moderate agreement, with Fleiss’ multi-rater Kappa indicating acceptable concordance for clinical annotation.
Ratings were compared with patient-reported urinary continence outcomes at 6 weeks postoperatively to explore possible associations.
Majority vote consensus ratings were used to define ground truth for potential AI model training.
No formal training or consensus meetings were conducted prior to rating, reflecting natural variability in expert judgment.
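
Fleiss' multi-rater kappa, the agreement statistic named in these findings, can be computed directly from a table of ratings. The following is a minimal Python sketch, not the study's analysis code: the function name `fleiss_kappa`, the label strings, and the toy ratings are illustrative assumptions, and no study data are reproduced here.

```python
from collections import Counter

# Hypothetical label names for the 3-tier scale described in the summary.
CATEGORIES = ["complete", "partial", "none"]


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the labels given to item i.
    Returns NaN when chance agreement is 1 (every rating in one category),
    because kappa is undefined in that case.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # counts[i][j] = number of raters who assigned category j to item i
    counts = []
    for row in ratings:
        tally = Counter(row)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over items of the proportion of agreeing rater pairs
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions
    total = n_items * n_raters
    proportions = [sum(row[j] for row in counts) / total
                   for j in range(len(categories))]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Toy example: 3 videos, 6 raters each (made-up values, not study data)
    toy = [
        ["complete"] * 5 + ["partial"],
        ["partial"] * 4 + ["none"] * 2,
        ["none"] * 3 + ["partial"] * 2 + ["complete"],
    ]
    print(round(fleiss_kappa(toy), 3))
```

Under the widely used Landis and Koch benchmarks, kappa values of 0.41 to 0.60 are conventionally read as moderate agreement; this summary does not state which cut-offs the study applied.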

Clinical Implications

The moderate interrater agreement supports the feasibility of developing AI algorithms to assess apical mucosal preservation intraoperatively during HoLEP. Reliable expert annotations are critical to train models that can predict postoperative continence outcomes and potentially guide surgical technique or postoperative management. Incorporating standardized rating criteria and consensus-building may further improve annotation quality for AI applications.
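
The majority-vote consensus used to define ground truth can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the summary does not say how ties among six raters were resolved, or whether a strict majority was required, so the hypothetical `consensus_label` below takes the most frequent label and returns `None` when the top labels are tied.

```python
from collections import Counter


def consensus_label(labels):
    """Return the most frequent label, or None if the top labels are tied.

    Assumption: plurality voting with ties left unresolved. With six raters
    and three categories, splits such as 3-3 or 2-2-2 are possible.
    """
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave for adjudication or exclusion
    return ranked[0][0]


if __name__ == "__main__":
    # Made-up ratings for two videos (not study data)
    videos = {
        "video_01": ["complete", "complete", "complete", "complete", "partial", "partial"],
        "video_02": ["partial", "partial", "partial", "none", "none", "none"],
    }
    for name, labels in videos.items():
        print(name, consensus_label(labels))
```

Returning `None` rather than picking a tied label arbitrarily keeps ambiguous videos visible, so they can be reviewed before being used as training labels.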

Conclusion

Expert urologists demonstrate moderate concordance in visually classifying apical mucosal preservation in HoLEP videos, providing a viable foundation for supervised AI model development. This approach may enhance intraoperative assessment and prognostication of urinary continence outcomes.

