In Depth We Trust: Reliable Monocular Depth Supervision for Gaussian Splatting

About

Using accurate depth priors in 3D Gaussian Splatting helps mitigate artifacts caused by sparse training data and textureless surfaces. However, acquiring accurate depth maps requires specialized acquisition systems. Foundation monocular depth estimation models offer a cost-effective alternative, but they suffer from scale ambiguity, multi-view inconsistency, and local geometric inaccuracies, which can degrade rendering performance when applied naively. This paper addresses the challenge of reliably leveraging monocular depth priors for Gaussian Splatting (GS) rendering enhancement. To this end, we introduce a training framework integrating scale-ambiguous and noisy depth priors into geometric supervision. We highlight the importance of learning from weakly aligned depth variations. We introduce a method to isolate ill-posed geometry for selective monocular depth regularization, restricting the propagation of depth inaccuracies into well-reconstructed 3D structures. Extensive experiments across diverse datasets show consistent improvements in geometric accuracy, leading to more faithful depth estimation and higher rendering quality across different GS variants and monocular depth backbones tested.
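The scale ambiguity mentioned above can be illustrated with a minimal sketch of a common baseline: fitting a global scale and shift that maps a monocular depth map onto reference depths by least squares. This is not necessarily the paper's approach, which emphasizes learning from weakly aligned depth variations. The function name `align_scale_shift` and the flat-list input format are assumptions made for illustration only.

```python
def align_scale_shift(mono, ref):
    """Least-squares fit of (s, t) so that s * mono[i] + t approximates ref[i].

    mono: monocular depth values (arbitrary scale), flat list of floats.
    ref:  reference depth values at the same pixels (e.g. sparse SfM depth).
    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(mono)
    if n != len(ref) or n < 2:
        raise ValueError("need two equally sized lists with at least 2 samples")

    mean_m = sum(mono) / n
    mean_r = sum(ref) / n

    # Variance of the monocular depth and its covariance with the reference.
    var_m = sum((m - mean_m) ** 2 for m in mono)
    if var_m == 0:
        raise ValueError("monocular depth is constant; scale is undetermined")
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, ref))

    s = cov / var_m            # closed-form least-squares scale
    t = mean_r - s * mean_m    # closed-form least-squares shift
    return s, t


# Usage: the reference depths here are exactly 2 * mono + 0.5.
mono_depth = [1.0, 2.0, 3.0, 4.0]
ref_depth = [2.5, 4.5, 6.5, 8.5]
scale, shift = align_scale_shift(mono_depth, ref_depth)
aligned = [scale * m + shift for m in mono_depth]
```

A single global fit like this removes the scale and shift ambiguity but cannot correct multi-view inconsistency or local geometric errors, which is why the abstract argues against applying monocular depth naively.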

Wenhui Xiao, Ethan Goan, Rodrigo Santa Cruz, David Ahmedt-Aristizabal, Olivier Salvado, Clinton Fookes, Leo Lebrat • 2026

Related benchmarks

Task	Dataset	Result
Novel View Synthesis	ScanNet++	PSNR 24.53	93
Depth Estimation	ScanNet++	AbsRel 0.108	40
Novel View Synthesis	TanksAndTemples Low Data	PSNR 20.578	9
Novel View Synthesis	TanksAndTemples Moderate Data	PSNR 23.414	9
Novel View Synthesis	MipNeRF 360 Low Data	PSNR 22.253	9
Novel View Synthesis	MipNeRF 360 Moderate Data	PSNR 25.716	9
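For readers unfamiliar with the two metrics in the table, the sketch below shows their standard definitions: PSNR (higher is better) for rendered images and absolute relative error, AbsRel (lower is better), for depth. The exact evaluation protocol behind the numbers above is not stated on this page, so the value range `max_val=1.0` and the masking of non-positive ground-truth depths are assumptions.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two flat lists of pixel values.

    Assumes pixel intensities lie in [0, max_val].
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and equally sized")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


def abs_rel(pred_depth, gt_depth):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth."""
    valid = [(p, g) for p, g in zip(pred_depth, gt_depth) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth depth values")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


# Usage on toy data.
image_score = psnr([0.5, 0.5], [0.4, 0.6])
depth_error = abs_rel([1.1, 1.8, 3.0], [1.0, 2.0, 0.0])  # last pixel is masked out
```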

