
Depth from a Single Image by Harmonizing Overcomplete Local Network Predictions

About

A single color image can contain many cues informative towards different aspects of local geometric structure. We approach the problem of monocular depth estimation by using a neural network to produce a mid-level representation that summarizes these cues. This network is trained to characterize local scene geometry by predicting, at every image location, depth derivatives of different orders, orientations and scales. However, instead of a single estimate for each derivative, the network outputs probability distributions that allow it to express confidence about some coefficients, and ambiguity about others. Scene depth is then estimated by harmonizing this overcomplete set of network predictions, using a globalization procedure that finds a single consistent depth map that best matches all the local derivative distributions. We demonstrate the efficacy of this approach through evaluation on the NYU v2 depth data set.

Ayan Chakrabarti, Jingyu Shao, Gregory Shakhnarovich • 2016
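
As a rough illustration of the harmonization idea in the abstract, the sketch below fuses local predictions along a single 1D scanline into one consistent depth profile. It is not the authors' procedure. It assumes each prediction is a single Gaussian (a mean and a variance), whereas the abstract says only that the network outputs probability distributions, and it uses only zeroth-order (depth) and first-order (forward-difference) coefficients. The function name `harmonize_1d` and all of its parameters are hypothetical.

```python
def harmonize_1d(abs_mean, abs_var, grad_mean, grad_var):
    """Fuse depth and depth-difference predictions by weighted least squares.

    Assumption (not from the source): every prediction is a Gaussian, so
    harmonization reduces to minimizing
        sum_i (z[i] - abs_mean[i])**2 / abs_var[i]
      + sum_i (z[i+1] - z[i] - grad_mean[i])**2 / grad_var[i]
    over the depth profile z. Low variance means a confident prediction.
    """
    n = len(abs_mean)
    if len(abs_var) != n or len(grad_mean) != n - 1 or len(grad_var) != n - 1:
        raise ValueError("inconsistent input lengths")

    # Normal equations A z = b, where A is symmetric and tridiagonal.
    diag = [1.0 / v for v in abs_var]
    rhs = [m / v for m, v in zip(abs_mean, abs_var)]
    off = [0.0] * (n - 1)
    for i, (g, v) in enumerate(zip(grad_mean, grad_var)):
        w = 1.0 / v
        diag[i] += w
        diag[i + 1] += w
        off[i] = -w
        rhs[i] -= w * g
        rhs[i + 1] += w * g

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    z = [0.0] * n
    z[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        z[i] = (rhs[i] - off[i] * z[i + 1]) / diag[i]
    return z


# Confident depth at the two ends, ambiguous depth in the middle,
# confident slope everywhere: the slopes pull the middle into line.
profile = harmonize_1d(
    abs_mean=[1.0, 1.9, 1.6, 2.5],
    abs_var=[0.01, 1.0, 1.0, 0.01],
    grad_mean=[0.5, 0.5, 0.5],
    grad_var=[0.01, 0.01, 0.01],
)
print(profile)
```

The variances act as the confidence described in the abstract: a confident coefficient constrains the result strongly, and an ambiguous one is largely overridden by its neighbours.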

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Depth Estimation | NYU v2 (test) | Threshold Accuracy (delta < 1.25) | 80.6 | 423 |
| Monocular Depth Estimation | NYU v2 (test) | Abs Rel | 0.149 | 257 |
| Depth Estimation | NYU Depth V2 | RMSE | 0.62 | 177 |
| Depth Prediction | NYU Depth V2 (test) | Accuracy (δ < 1.25) | 80.6 | 113 |
| Monocular Depth Estimation | NYU Depth Eigen v2 (test) | A.Rel | 0.149 | 49 |
| Single-view depth estimation | NYUv2 36 (test) | AbsRel | 0.149 | 21 |
| Single-view depth estimation | NYU official 654 images v2 (test) | AbsRel | 0.149 | 21 |