
Digging Into Self-Supervised Monocular Depth Estimation

About

Per-pixel ground-truth depth data is challenging to acquire at scale. To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. In this paper, we propose a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. We show that a surprisingly simple model, and associated design choices, lead to superior predictions. In particular, we propose (i) a minimum reprojection loss, designed to robustly handle occlusions, (ii) a full-resolution multi-scale sampling method that reduces visual artifacts, and (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. We demonstrate the effectiveness of each component in isolation, and show high quality, state-of-the-art results on the KITTI benchmark.
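The abstract names the minimum reprojection loss and the auto-masking loss without giving their form. The sketch below is one plausible reading, not the authors' code. It assumes that per-pixel photometric errors between the target frame and each source frame have already been computed, that the minimum is taken per pixel across source frames, and that auto-masking keeps a pixel only when warping the source frame lowers its error relative to the unwarped source frame. The function name `min_reprojection_loss` and both array arguments are hypothetical.

```python
import numpy as np

def min_reprojection_loss(errors_warped, errors_identity=None):
    """Illustrative sketch only (assumed form, hypothetical names).

    errors_warped:   (S, H, W) per-pixel photometric error between the target
                     frame and each of S source frames warped into its view.
    errors_identity: optional (S, H, W) error between the target frame and the
                     same source frames left unwarped.
    Returns (scalar loss, boolean mask of pixels that contributed).
    """
    errors_warped = np.asarray(errors_warped, dtype=float)

    # Per-pixel minimum over source frames: a pixel occluded in one source
    # frame can still be explained by another, so it is not penalised.
    min_warped = errors_warped.min(axis=0)

    if errors_identity is None:
        mask = np.ones(min_warped.shape, dtype=bool)
        return float(min_warped.mean()), mask

    # Assumed auto-mask: keep a pixel only if warping reduces its error.
    # Pixels that look the same with or without warping (static camera,
    # objects moving with the camera) are dropped from the loss.
    min_identity = np.asarray(errors_identity, dtype=float).min(axis=0)
    mask = min_warped < min_identity

    if not mask.any():
        return 0.0, mask
    return float(min_warped[mask].mean()), mask
```

Averaging the errors over source frames instead of taking the minimum would let an occluded pixel's large error dominate. Taking the minimum is the design choice the abstract credits with handling occlusions robustly.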

Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow • 2018

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Monocular Depth Estimation | KITTI (Eigen) | Abs Rel | 0.106 | 523 |
| Depth Estimation | NYU v2 (test) | Threshold Accuracy (delta < 1.25) | 77.1 | 435 |
| Monocular Depth Estimation | NYU v2 (test) | Abs Rel | 0.154 | 320 |
| Depth Estimation | KITTI (Eigen split) | RMSE | 4.577 | 291 |
| Surface Normal Estimation | NYU v2 (test) | Mean Angle Distance (MAD) | 43.8 | 224 |
| Monocular Depth Estimation | KITTI | Abs Rel | 0.115 | 220 |
| Monocular Depth Estimation | KITTI (Eigen split) | Abs Rel | 0.106 | 215 |
| Monocular Depth Estimation | NYU V2 | -- | -- | 174 |
| Monocular Depth Estimation | KITTI Raw Eigen (test) | RMSE | 4.701 | 159 |
| Depth Estimation | KITTI | RMSE | 4.863 | 156 |
Showing 10 of 120 rows
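
The depth metrics listed above (Abs Rel, RMSE, and threshold accuracy with delta < 1.25) have standard definitions, sketched below with NumPy. This is a minimal illustration, not any benchmark's evaluation script. The function name `depth_metrics` is hypothetical, and the sketch assumes that ground-truth pixels with value 0 are invalid and that all predictions are strictly positive. Real evaluation protocols typically add depth caps, crops, or scale alignment, which are omitted here.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Illustrative sketch: standard depth-error metrics over valid pixels.

    pred, gt: arrays of the same shape holding predicted and ground-truth
              depth. Pixels with gt <= 0 are treated as missing (assumption).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)

    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    # Abs Rel: mean absolute error relative to the true depth.
    abs_rel = np.mean(np.abs(pred - gt) / gt)

    # RMSE: root mean squared error, in the same units as the depth values.
    rmse = np.sqrt(np.mean((pred - gt) ** 2))

    # Threshold accuracy: fraction of pixels whose ratio to the ground truth,
    # in whichever direction is larger, stays below 1.25.
    ratio = np.maximum(pred / gt, gt / pred)
    delta_1 = np.mean(ratio < 1.25)

    return {"abs_rel": float(abs_rel), "rmse": float(rmse), "delta_1": float(delta_1)}
```

`delta_1` is returned as a fraction in [0, 1]. A result such as 77.1 in the table corresponds to that fraction expressed as a percentage.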
