Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance

About

Self-supervised monocular depth estimation presents a powerful method to obtain 3D scene information from single camera images, which is trainable on arbitrary image sequences without requiring depth labels, e.g., from a LiDAR sensor. In this work we present a new self-supervised semantically-guided depth estimation (SGDepth) method to deal with moving dynamic-class (DC) objects, such as moving cars and pedestrians, which violate the static-world assumptions typically made during training of such models. Specifically, we propose (i) mutually beneficial cross-domain training of (supervised) semantic segmentation and self-supervised depth estimation with task-specific network heads, (ii) a semantic masking scheme providing guidance to prevent moving DC objects from contaminating the photometric loss, and (iii) a detection method for frames with non-moving DC objects, from which the depth of DC objects can be learned. We demonstrate the performance of our method on several benchmarks, in particular on the Eigen split, where we exceed all baselines without test-time refinement.
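The semantic masking idea in (ii) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is only an assumption about its general shape: per-pixel photometric errors are averaged over pixels whose predicted segmentation label is not a dynamic class. The function name, the nested-list image representation, and the class IDs are hypothetical and chosen for illustration only.

```python
def masked_photometric_loss(photo_error, seg_labels, dynamic_classes):
    """Mean photometric error over pixels NOT labeled as a dynamic class.

    photo_error: 2D list of per-pixel photometric errors (floats).
    seg_labels: 2D list of per-pixel class IDs, same shape as photo_error.
    dynamic_classes: set of class IDs treated as dynamic (hypothetical IDs).
    """
    total, count = 0.0, 0
    for err_row, label_row in zip(photo_error, seg_labels):
        for err, label in zip(err_row, label_row):
            if label not in dynamic_classes:  # keep static-scene pixels only
                total += err
                count += 1
    # If every pixel is masked out, contribute no loss for this frame.
    return total / count if count else 0.0


# Illustrative 2x2 example; class IDs 11 and 13 are assumed placeholders.
photo_error = [[0.1, 0.9],
               [0.2, 0.3]]
seg_labels = [[0, 13],
              [0, 1]]
loss = masked_photometric_loss(photo_error, seg_labels, {11, 13})
print(round(loss, 3))
```

In this toy example the pixel with the large error (0.9) carries a dynamic-class label, so it is excluded and the loss is the mean of the remaining three errors.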

Marvin Klingner, Jan-Aike Termöhlen, Jonas Mikolajczyk, Tim Fingscheidt • 2020

Related benchmarks

Task                        Dataset                   Result         Rank
Monocular Depth Estimation  KITTI (Eigen)             Abs Rel 0.107  502
Depth Estimation            KITTI (Eigen split)       RMSE 4.468     276
Monocular Depth Estimation  KITTI (Eigen split)       Abs Rel 0.107  193
Monocular Depth Estimation  KITTI                     Abs Rel 0.107  161
Monocular Depth Estimation  KITTI Raw Eigen (test)    RMSE 4.693     159
Monocular Depth Estimation  KITTI 2015 (Eigen split)  Abs Rel 0.113  95
Semantic segmentation       KITTI (val)               mIoU 53.04     24
Semantic segmentation       vKITTI 2                  mIoU 80.19     20
Semantic segmentation       Cityscapes                mIoU 55.96     20
Semantic segmentation       KITTI 2015                mIoU 40.2      20
Showing 10 of 11 rows
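
For readers unfamiliar with the metrics in the table, the sketch below shows the standard definitions of Abs Rel (mean absolute relative error), RMSE, and mIoU on flat lists of values. It is an illustration only: the function names are hypothetical, and benchmark-specific details such as depth caps, crops, or scale alignment are not listed on this page and are deliberately left out. Note that the table reports mIoU in percent, while this sketch returns a fraction.

```python
import math


def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def rmse(pred, gt):
    """Root mean squared error over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


def mean_iou(pred_labels, gt_labels, num_classes):
    """Mean intersection-over-union across classes that occur at least once."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_labels, gt_labels))
        inter = sum(1 for p, g in pairs if p == c and g == c)
        union = sum(1 for p, g in pairs if p == c or g == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy values in meters; a ground-truth depth of 0 would be treated as missing.
pred_depth = [2.0, 4.0, 10.0]
gt_depth = [2.0, 5.0, 8.0]
print(abs_rel(pred_depth, gt_depth), rmse(pred_depth, gt_depth))
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Lower is better for Abs Rel and RMSE, while higher is better for mIoU.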
