LocalBins: Improving Depth Estimation by Learning Local Distributions

About

We propose a novel architecture for depth estimation from a single image. It is based on the popular encoder-decoder design that is frequently used as a starting point for dense regression tasks. We build on AdaBins, which estimates a global distribution of depth values for the input image, and evolve the architecture in two ways. First, instead of predicting global depth distributions, we predict depth distributions of local neighborhoods at every pixel. Second, instead of predicting depth distributions only towards the end of the decoder, we involve all layers of the decoder. We call this new architecture LocalBins. Our results demonstrate a clear improvement over the state-of-the-art in all metrics on the NYU-Depth V2 dataset. Code and pretrained models will be made publicly available.
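
The difference between a global and a per-pixel depth distribution can be illustrated with a small NumPy sketch. It assumes an AdaBins-style readout in which depth is the probability-weighted sum of bin centers; the abstract does not spell out the readout, so treat this as an illustration rather than the authors' implementation. The function name `depth_from_bins` and the array layouts are hypothetical.

```python
import numpy as np

def depth_from_bins(centers, logits):
    """Expected depth per pixel from bin centers and per-pixel bin scores.

    centers: (K,) for one global set of bins shared by the whole image, or
             (H, W, K) for a separate set of bins at every pixel (assumed layout).
    logits:  (H, W, K) unnormalised scores over the K bins at each pixel.
    """
    centers = np.asarray(centers, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    # Softmax over the bin axis, shifted by the max for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Broadcasting lets the same line handle global (K,) and local (H, W, K) centers.
    return (probs * centers).sum(axis=-1)

# Global bins: every pixel shares the same three centers.
global_depth = depth_from_bins([1.0, 2.0, 3.0], np.zeros((2, 2, 3)))

# Local bins: each pixel has its own centers, here shifted by a per-pixel offset.
offsets = np.array([[0.0, 1.0], [2.0, 3.0]])[..., None]
local_centers = np.array([1.0, 2.0, 3.0]) + offsets
local_depth = depth_from_bins(local_centers, np.zeros((2, 2, 3)))
```

With uniform scores, the global variant gives the same depth at every pixel, while the local variant lets each pixel's estimate follow its own bin range.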

Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Depth Estimation | NYU v2 (test) | Threshold Accuracy (delta < 1.25) | 90.7 | 423 |
| Monocular Depth Estimation | KITTI | Abs Rel | 0.133 | 161 |
| Monocular Depth Estimation | DDAD (test) | RMSE | 8.139 | 122 |
| Depth Estimation | SUN RGB-D (test) | Root Mean Square Error (RMS) | 0.47 | 93 |
| Monocular Depth Estimation | NYU-Depth v2 (official) | Abs Rel | 0.099 | 75 |
| Depth Estimation | iBims 1 (test) | REL | 0.21 | 41 |
| Monocular Depth Estimation | iBIMS-1 | ARel | 0.296 | 32 |
| Monocular Depth Estimation | Diode Indoor (test) | A.Rel | 0.412 | 25 |
| Monocular Depth Estimation | Virtual KITTI 2 (test) | Delta 1 Acc | 81 | 22 |
| Monocular Depth EstimationSUN RGB-D | Absolute Relative Error (Abs Rel) | 0.156 | 19 |
Showing 10 of 21 rows
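
For readers unfamiliar with the metric names in the table, the sketch below computes the three that recur most often (absolute relative error, RMSE, and threshold accuracy at delta < 1.25) using their standard definitions. It is not the evaluation code behind the listed results. The function name `depth_metrics` is hypothetical, and treating ground-truth values of 0 as missing is an assumption, although a common one.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular depth metrics over pixels with valid ground truth.

    pred, gt: arrays of the same shape holding positive depths; gt == 0 is
    treated as "no ground truth" (an assumed convention).
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    abs_rel = np.mean(np.abs(pred - gt) / gt)   # Abs Rel / REL / ARel
    rmse = np.sqrt(np.mean((pred - gt) ** 2))   # RMSE / RMS
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)              # threshold accuracy, as a fraction
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}

metrics = depth_metrics(pred=[1.0, 2.0, 4.0, 5.0], gt=[1.0, 2.0, 2.0, 0.0])
```

The function returns `delta1` as a fraction in [0, 1]; the threshold-accuracy entries in the table (90.7 and 81) appear to be reported as percentages.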
