SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes
About
Self-supervised monocular depth estimation has shown impressive results in static scenes. It relies on the multi-view consistency assumption for training networks; however, this assumption is violated in dynamic object regions and occlusions. Consequently, existing methods show poor accuracy in dynamic scenes, and the estimated depth map is blurred at object boundaries because these regions are usually occluded in other training views. In this paper, we propose SC-DepthV3 to address these challenges. Specifically, we introduce an external pretrained monocular depth estimation model to generate a single-image depth prior, namely pseudo-depth, based on which we propose novel losses to boost self-supervised training. As a result, our model can predict sharp and accurate depth maps, even when trained on monocular videos of highly dynamic scenes. We demonstrate the significantly superior performance of our method over previous methods on six challenging datasets, and we provide detailed ablation studies for the proposed terms. Source code and data will be released at https://github.com/JiawangBian/sc_depth_pl
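To illustrate how a pseudo-depth prior can supervise a self-supervised depth network, below is a minimal sketch of a pairwise ordinal ranking loss: the pseudo-depth only dictates the depth *ordering* of sampled pixel pairs, so the noisy absolute scale of the prior never leaks into training. This is an illustrative assumption, not the exact loss from the paper; the function name `pseudo_depth_ranking_loss`, the threshold `tau`, and the pair-sampling scheme are all hypothetical, and the real implementation lives in the linked repository.

```python
import numpy as np

def pseudo_depth_ranking_loss(pred, pseudo, pairs, tau=1.02):
    """Hypothetical ordinal ranking loss guided by pseudo-depth.

    pred, pseudo : 1-D arrays of predicted / pseudo depths at sampled pixels.
    pairs        : list of (i, j) index pairs to compare.
    tau          : ratio threshold deciding when pseudo-depth calls a pair
                   "farther / closer" rather than "roughly equal".
    """
    total = 0.0
    for i, j in pairs:
        ratio = pseudo[i] / pseudo[j]
        diff = pred[i] - pred[j]
        if ratio > tau:        # prior says pixel i is farther: push diff > 0
            total += np.log1p(np.exp(-diff))
        elif ratio < 1.0 / tau:  # prior says pixel j is farther: push diff < 0
            total += np.log1p(np.exp(diff))
        else:                  # roughly equal depth: penalize any gap
            total += diff ** 2
    return total / len(pairs)
```

A prediction that agrees with the pseudo-depth ordering (e.g. `pred = [5.0, 1.0]` with `pseudo = [5.0, 1.0]`) yields a much smaller loss than one that inverts it, which is the training signal that sharpens depth at dynamic-object boundaries.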
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Depth Estimation | KITTI (Eigen split) | RMSE | 4.709 | 276 |
| Monocular Depth Estimation | NYU V2 | -- | -- | 113 |
| Monocular Depth Estimation | KITTI Raw (Eigen) | Abs Rel | 11.8 | 23 |
| Monocular Depth Estimation | DDAD | Abs Rel Error | 0.142 | 17 |
| Video Depth Estimation | NYUDV2 (Eigen split) | OPW Score | 0.441 | 15 |
| Camera pose estimation | KITTI Odometry unified mapping protocol (Sequence 09) | ATE (m) | 23.174 | 9 |
| Video Depth Estimation | KITTI (Eigen split) | Delta1 Acc | 86.4 | 9 |
| Camera pose estimation | KITTI Odometry unified mapping protocol (Sequence 07) | ATE (m) | 27.224 | 8 |