SHADeS: self-supervised monocular depth estimation through non-Lambertian image decomposition - Report - MDSpire

By Rema Daher, Francisco Vasconcelos, Danail Stoyanov
May 13, 2025

Self-Supervised Monocular Depth Estimation via Non-Lambertian Image Decomposition in SHADeS

Overview

This study introduces SHADeS, a novel self-supervised monocular depth estimation framework that improves robustness to specular reflections in endoscopic images by decomposing images into albedo, shading, and specular reflection components. The method outperforms state-of-the-art approaches on real and phantom colonoscopy datasets, enabling better depth estimation and implicit specularity segmentation.

Background

Colorectal cancer is a leading cause of cancer-related mortality worldwide, with early detection significantly improving survival rates. Colonoscopy is the primary diagnostic tool but suffers from visibility challenges due to reflections and lighting variations in the endoscopic environment. Monocular depth estimation is critical for 3D reconstruction and navigation during endoscopy, yet existing methods struggle with non-Lambertian surfaces typical of wet, reflective tissue. Self-supervised learning approaches have advanced the field but remain limited by specular highlights that degrade depth estimation accuracy.

Data Highlights

Method | Dataset | Performance
SHADeS | Hyper Kvasir (real) | Improved robustness to specular reflections
SHADeS | C3VD (phantom) | Superior depth estimation accuracy
IID-SfMLearner | Various | Albedo extraction with specular artefacts
Monodepth2, MonoViT | Endoscopy data | Suboptimal under specular conditions

Key Findings

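  • SHADeS extends the Lambertian image decomposition model (image = albedo × shading) with an additive specular reflection term, giving I = A·S + M, where I is the input image, A the albedo, S the shading, and M the specular component.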
  • The model jointly estimates depth, albedo, shading, and specular reflections, improving separation of true tissue color from reflections.
  • SHADeS effectively removes specular artefacts from albedo maps, unlike previous methods such as IID-SfMLearner.
  • The framework implicitly produces specularity segmentation masks and can generate inpainted images free of specular highlights.
  • SHADeS demonstrates superior monocular depth estimation performance on both real (Hyper Kvasir) and phantom (C3VD) colonoscopy datasets compared with state-of-the-art methods such as Monodepth2, MonoViT, and IID-SfMLearner.
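
The decomposition described in the findings above can be illustrated with a small NumPy sketch. This is not the authors' implementation: in SHADeS the components are estimated by the model, whereas here they are synthetic arrays. The function names, the single-channel shading map, and the 0.1 threshold used to derive a mask are all assumptions made for illustration.

```python
import numpy as np

def compose(albedo, shading, specular):
    """Recombine components under I = A * S + M (element-wise product)."""
    # Clipping mimics sensor saturation at bright highlights.
    return np.clip(albedo * shading + specular, 0.0, 1.0)

def specular_mask(specular, threshold=0.1):
    """Boolean mask of pixels whose specular term exceeds a threshold.

    The threshold is a hypothetical value chosen for this example.
    """
    return specular.max(axis=-1) > threshold

def remove_specular(albedo, shading):
    """Specular-free image: keep only the Lambertian term A * S."""
    return np.clip(albedo * shading, 0.0, 1.0)

# Synthetic 4x4 RGB example with one specular highlight.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 0.8, size=(4, 4, 3))    # tissue colour
shading = rng.uniform(0.3, 1.0, size=(4, 4, 1))   # broadcast over RGB
specular = np.zeros((4, 4, 3))
specular[1, 2] = 0.5                              # highlight at row 1, col 2

image = compose(albedo, shading, specular)
mask = specular_mask(specular)
clean = remove_specular(albedo, shading)
```

Outside the mask, `clean` matches `image` exactly, because the specular term is zero there; inside the mask, `clean` is darker than `image`, which is the sense in which dropping M yields a highlight-free image.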

Clinical Implications

Improved monocular depth estimation in endoscopy can enhance 3D reconstruction, aiding in the detection of missed lesions and better characterization of polyps. By effectively handling specular reflections, SHADeS can provide clearer visualization and more reliable navigation during colonoscopy. This advancement may contribute to higher early detection rates of colorectal cancer and improved training tools for endoscopists.

Conclusion

SHADeS represents a significant advancement in self-supervised monocular depth estimation for endoscopy by addressing the challenges posed by non-Lambertian surfaces and specular reflections. Its ability to jointly estimate depth and decompose image components enhances robustness and accuracy, with promising applications in colorectal cancer diagnosis and endoscopic navigation.

References

  1. SfMLearner (Zhou et al. 2017) -- Unsupervised Learning of Depth and Ego-Motion from Video
  2. MonoViT (Zhao et al. 2022) -- Transformer-based Monocular Depth Estimation
  3. Monodepth2 (Godard et al. 2019) -- Self-Supervised Monocular Depth Estimation with Improved Losses
  4. Hyper Kvasir Dataset -- Real Endoscopy Data
  5. C3VD Dataset -- Phantom Colonoscopy Data
