Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network

About

Monocular depth estimation is an ill-posed task: objects in a 2D image usually do not carry enough information to predict their depth. It therefore behaves differently from other tasks (e.g., classification and segmentation) in many ways. In this paper, we find that self-supervised monocular depth estimation shows direction sensitivity and environmental dependency in its feature representation. However, current backbones borrowed from other tasks pay little attention to handling different types of environmental information, which limits overall depth accuracy. To bridge this gap, we propose a new Direction-aware Cumulative Convolution Network (DaCCN), which improves the depth feature representation in two ways. First, we propose a direction-aware module that learns to adjust feature extraction in each direction, facilitating the encoding of different types of information. Second, we design a new cumulative convolution that aggregates important environmental information more efficiently. Experiments show that our method achieves significant improvements on three widely used benchmarks (KITTI, Cityscapes, and Make3D), setting a new state of the art on these benchmarks with all three types of self-supervision.

Wencheng Han, Junbo Yin, Jianbing Shen • 2023

Related benchmarks

Task	Dataset	Result
Monocular Depth Estimation	KITTI (Eigen)	Abs Rel 0.099	552
Monocular Depth Estimation	KITTI (Eigen split)	Abs Rel 0.099	215
Monocular Depth Estimation	KITTI Improved GT (Eigen)	Abs Rel 0.094	116
Depth Estimation	KITTI improved dense ground truth	Abs Rel 0.094	29
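
The Abs Rel figures above are the absolute relative error, conventionally defined as the mean of |predicted depth - ground-truth depth| / ground-truth depth over pixels with valid ground truth. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `abs_rel` and the validity range (`min_depth`, `max_depth`, with 80 m as a commonly used cap on KITTI) are assumptions made for illustration.

```python
def abs_rel(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Mean absolute relative error over pixels with valid ground truth.

    pred, gt: flat sequences of depths in metres, aligned element by element.
    min_depth, max_depth: assumed validity range for ground-truth depths.
    """
    errors = [
        abs(p - g) / g
        for p, g in zip(pred, gt)
        if min_depth < g < max_depth  # skip missing or out-of-range ground truth
    ]
    if not errors:
        raise ValueError("no valid ground-truth depths")
    return sum(errors) / len(errors)


# Three pixels with relative errors of about 0.1, 0.0 and 0.1.
score = abs_rel([1.1, 2.0, 4.5], [1.0, 2.0, 5.0])
```

Lower is better: a score of 0.099 means predictions deviate from the ground truth by roughly 9.9% on average.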
