
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

About

This paper tackles the problem of depth estimation from a single image. Existing work either focuses on generalization performance disregarding metric scale, i.e. relative depth estimation, or state-of-the-art results on specific datasets, i.e. metric depth estimation. We propose the first approach that combines both worlds, leading to a model with excellent generalization performance while maintaining metric scale. Our flagship model, ZoeD-M12-NK, is pre-trained on 12 datasets using relative depth and fine-tuned on two datasets using metric depth. We use a lightweight head with a novel bin adjustment design called metric bins module for each domain. During inference, each input image is automatically routed to the appropriate head using a latent classifier. Our framework admits multiple configurations depending on the datasets used for relative depth pre-training and metric fine-tuning. Without pre-training, we can already significantly improve the state of the art (SOTA) on the NYU Depth v2 indoor dataset. Pre-training on twelve datasets and fine-tuning on the NYU Depth v2 indoor dataset, we can further improve SOTA for a total of 21% in terms of relative absolute error (REL). Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains. The code and pre-trained models are publicly available at https://github.com/isl-org/ZoeDepth.

Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, Matthias Müller • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Monocular Depth Estimation | KITTI (Eigen) | Abs Rel | 0.054 | 502
Depth Estimation | NYU v2 (test) | Threshold Accuracy (delta < 1.25) | 95.5 | 423
Depth Estimation | KITTI (Eigen split) | RMSE | 2.29 | 276
Monocular Depth Estimation | NYU v2 (test) | Abs Rel | 0.075 | 257
Monocular Depth Estimation | KITTI (Eigen split) | Abs Rel | 0.054 | 193
Depth Estimation | NYU Depth V2 | RMSE | 0.27 | 177
Monocular Depth Estimation | KITTI | Abs Rel | 0.06 | 161
Monocular Depth Estimation | DDAD (test) | RMSE | 7.108 | 122
Monocular Depth Estimation | ETH3D | AbsRel | 0.5 | 117
Monocular Depth Estimation | NYU V2 | Delta 1 Acc | 95.5 | 113
Showing 10 of 114 rows
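
The benchmark rows above report Abs Rel (the REL of the abstract), RMSE, and threshold accuracy (delta < 1.25). As a rough illustration of how these numbers are commonly computed, here is a minimal sketch using the standard definitions from the depth estimation literature. It is not taken from the ZoeDepth code base: the function name `depth_metrics`, the flat-list inputs, and the rule that pixels with non-positive ground truth are skipped are all assumptions made for this example.

```python
import math

def depth_metrics(pred, gt):
    """Return (abs_rel, rmse, delta1) for two equal-length depth sequences.

    Hypothetical helper for illustration only. Pixels whose ground-truth
    depth is not positive are treated as invalid and skipped (an assumption).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)

    # Abs Rel: mean of |pred - gt| / gt
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n

    # RMSE: square root of the mean squared error, in the depth unit of the inputs
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    # delta1: fraction of pixels with max(pred/gt, gt/pred) < 1.25;
    # a non-positive prediction is counted as a miss
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < 1.25)
    delta1 = hits / n

    return abs_rel, rmse, delta1


if __name__ == "__main__":
    # Toy values, not real benchmark data
    abs_rel, rmse, delta1 = depth_metrics([1.0, 2.0, 4.0], [1.0, 2.5, 0.0])
    print(abs_rel, rmse, delta1)
```

This function returns delta1 as a fraction in [0, 1]. The values of 95.5 in the table appear to be the same quantity expressed as a percentage, so multiply by 100 to compare.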
