DESC: Domain Adaptation for Depth Estimation via Semantic Consistency
About
Accurate real-world depth annotations are difficult to acquire, as they require special devices such as a LiDAR sensor. Self-supervised methods try to overcome this problem by processing video or stereo sequences, which may not always be available. Instead, in this paper, we propose a domain adaptation approach to train a monocular depth estimation model using a fully annotated source dataset and a non-annotated target dataset. We bridge the domain gap by leveraging semantic predictions and low-level edge features to provide guidance for the target domain. We enforce consistency between the main model and a second model trained with semantic segmentation and edge maps, and introduce priors in the form of instance heights. Our approach is evaluated on standard domain adaptation benchmarks for monocular depth estimation and shows consistent improvement upon the state of the art.
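The consistency idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the consistency term, so the mean absolute difference used here is an assumption, as are the function name `consistency_loss` and the flat-list representation of the two depth maps.

```python
def consistency_loss(depth_main, depth_guided):
    """Mean absolute difference between two depth predictions.

    depth_main:   per-pixel depths from the main model (flat list of floats).
    depth_guided: per-pixel depths from the second, semantics/edge-guided
                  model on the same target image (flat list of floats).
    The L1 form is an assumption for illustration, not the paper's stated loss.
    """
    if len(depth_main) != len(depth_guided):
        raise ValueError("both predictions must cover the same pixels")
    if not depth_main:
        raise ValueError("predictions must be non-empty")
    total = sum(abs(a - b) for a, b in zip(depth_main, depth_guided))
    return total / len(depth_main)


# Hypothetical predictions for a two-pixel target image.
loss = consistency_loss([1.0, 2.0], [1.5, 2.0])
```

Because the target dataset has no depth labels, a term of this kind needs only the two predictions, which is what makes it usable on the target domain.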
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Monocular Depth Estimation | Make3D (test) | Abs Rel | 0.393 | 132 |
| Monocular Depth Estimation | KITTI 50m cap Eigen split (test) | Absolute Relative Error | 0.116 | 19 |
| Depth Estimation | KITTI Eigen split 80m cap | Abs Rel Error | 0.122 | 18 |
| Monocular Depth Estimation | KITTI Stereo 2015 (test) | Abs Rel | 0.085 | 9 |
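The results above are all absolute relative errors (Abs Rel), the mean of |predicted - true| / true over pixels with valid ground truth. The sketch below shows that computation with a depth cap like the 50m and 80m caps named in the table. The function name `abs_rel`, the `min_depth` threshold, and the choice to filter on ground-truth depth only are assumptions; benchmark protocols may also clip predictions or apply a crop.

```python
def abs_rel(pred, gt, max_depth=80.0, min_depth=1e-3):
    """Absolute relative error over valid ground-truth pixels.

    pred, gt:  flat lists of predicted and ground-truth depths in metres.
    max_depth: depth cap; pixels with ground truth beyond it are ignored.
    min_depth: assumed lower bound that discards missing (zero) ground truth.
    """
    errors = [
        abs(p - g) / g
        for p, g in zip(pred, gt)
        if min_depth < g <= max_depth
    ]
    if not errors:
        raise ValueError("no valid ground-truth depths within the cap")
    return sum(errors) / len(errors)


# Hypothetical values; the last pixel has no ground truth (0.0) and is skipped.
pred = [10.0, 22.0, 45.0, 70.0]
gt = [10.0, 20.0, 50.0, 0.0]
score_80 = abs_rel(pred, gt, max_depth=80.0)
score_40 = abs_rel(pred, gt, max_depth=40.0)
```

Lowering the cap removes distant pixels from the average, which is one reason the same model can report different Abs Rel values under different caps.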