
Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume

About

Monocular depth estimation has become one of the most studied applications in computer vision, where the most accurate approaches are based on fully supervised learning models. However, the acquisition of accurate and large ground truth data sets to model these fully supervised methods is a major challenge for the further development of the area. Self-supervised methods trained with monocular videos constitute one of the most promising approaches to mitigate the challenge mentioned above due to the widespread availability of training data. Consequently, they have been intensively studied, where the main ideas explored consist of different types of model architectures, loss functions, and occlusion masks to address non-rigid motion. In this paper, we propose two new ideas to improve self-supervised monocular trained depth estimation: 1) self-attention, and 2) discrete disparity prediction. Compared with the usual localised convolution operation, self-attention can explore more general contextual information that allows the inference of similar disparity values at non-contiguous regions of the image. Discrete disparity prediction has been shown by fully supervised methods to provide a more robust and sharper depth estimation than the more common continuous disparity prediction, besides enabling the estimation of depth uncertainty. We show that the extension of the state-of-the-art self-supervised monocular trained depth estimator Monodepth2 with these two ideas allows us to design a model that produces the best results in the field in KITTI 2015 and Make3D, closing the gap with respect to self-supervised stereo training and fully supervised approaches.

Adrian Johnston, Gustavo Carneiro • 2020
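
To make the discrete disparity idea in the abstract more concrete, here is a minimal sketch for a single pixel. It assumes the network emits one logit per disparity bin, that a softmax turns those logits into a probability distribution, and that the final disparity is the probability-weighted mean, with the variance serving as a simple uncertainty estimate. The function names, the uniform bin spacing, and the bin range are illustrative assumptions, not details taken from this page.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-bin logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def disparity_bins(d_min, d_max, num_bins):
    """Uniformly spaced disparity levels (the spacing is an assumption)."""
    step = (d_max - d_min) / (num_bins - 1)
    return [d_min + i * step for i in range(num_bins)]


def expected_disparity(logits, bins):
    """Return (mean, variance) of the per-pixel disparity distribution.

    The mean is a soft-argmax over the discrete bins; the variance is one
    possible proxy for depth uncertainty at that pixel.
    """
    probs = softmax(logits)
    mean = sum(p * d for p, d in zip(probs, bins))
    var = sum(p * (d - mean) ** 2 for p, d in zip(probs, bins))
    return mean, var
```

A sharply peaked distribution gives a variance close to zero, while a flat distribution over the bins gives a large variance, which is how a discrete volume can flag ambiguous pixels.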

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Monocular Depth Estimation | KITTI (Eigen) | Abs Rel 0.106 | 502 |
| Depth Estimation | KITTI (Eigen split) | RMSE 4.699 | 276 |
| Monocular Depth Estimation | KITTI (Eigen split) | Abs Rel 0.106 | 193 |
| Monocular Depth Estimation | KITTI | Abs Rel 0.111 | 161 |
| Monocular Depth Estimation | KITTI Raw Eigen (test) | RMSE 4.699 | 159 |
| Monocular Depth Estimation | KITTI 80m maximum depth (Eigen) | Abs Rel 0.106 | 126 |
| Monocular Depth Estimation | KITTI 2015 (Eigen split) | Abs Rel 0.106 | 95 |
| Monocular Depth Estimation | KITTI improved ground truth (Eigen split) | Abs Rel 0.081 | 65 |
| Depth Prediction | KITTI original ground truth (test) | Abs Rel 0.106 | 38 |
| Depth Estimation | KITTI improved ground truth 2015 (93% Eigen split) | Abs Rel 0.081 | 32 |
Showing 10 of 16 rows
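
The two metrics reported in the table, Abs Rel and RMSE, follow the standard definitions used in depth estimation benchmarks: mean absolute relative error and root mean squared error over pixels with valid ground truth. The sketch below computes both over flat lists of predicted and ground-truth depths. The function name, the validity rule, and the 80 m default cap are assumptions for illustration. The exact evaluation protocol behind each row (cropping, scaling, ground-truth variant) is not stated on this page.

```python
import math


def depth_metrics(pred, gt, max_depth=80.0):
    """Return (abs_rel, rmse) over pixels with valid ground truth.

    A pixel counts as valid when 0 < gt <= max_depth; this rule and the
    80 m default are assumptions, not a protocol confirmed by the page.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if 0.0 < g <= max_depth]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    n = len(pairs)
    # Abs Rel: mean of |prediction - truth| / truth
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # RMSE: square root of the mean squared error, in depth units
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    return abs_rel, rmse
```

Both metrics are errors, so lower values are better. Abs Rel is unitless, while RMSE is in the same units as the depth values.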
