Neural RGB->D Sensing: Depth and Uncertainty from a Video Camera

About

Depth sensing is crucial for 3D reconstruction and scene understanding. Active depth sensors provide dense metric measurements, but often suffer from limitations such as restricted operating ranges, low spatial resolution, sensor interference, and high power consumption. In this paper, we propose a deep learning (DL) method to estimate per-pixel depth and its uncertainty continuously from a monocular video stream, with the goal of effectively turning an RGB camera into an RGB-D camera. Unlike prior DL-based methods, we estimate a depth probability distribution for each pixel rather than a single depth value, leading to an estimate of a 3D depth probability volume for each input frame. These depth probability volumes are accumulated over time under a Bayesian filtering framework as more incoming frames are processed sequentially, which effectively reduces depth uncertainty and improves accuracy, robustness, and temporal stability. Compared to prior work, the proposed approach achieves more accurate and stable results, and generalizes better to new datasets. Experimental results also show the output of our approach can be directly fed into classical RGB-D based 3D scanning methods for 3D scene reconstruction.
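The accumulation step described in the abstract can be illustrated with a minimal sketch of a per-pixel Bayesian update over discrete depth candidates. This is not the authors' implementation: it assumes a single pixel that stays aligned across frames (a real video pipeline would need to align the volumes between frames, which is not shown), and the names `fuse`, `mean_and_std`, and `gaussian_measurement` are hypothetical. The Gaussian-shaped measurement is a stand-in for whatever per-frame distribution a network would predict.

```python
import math

def fuse(prior, likelihood):
    """Bayesian update for one pixel: posterior is proportional to prior * likelihood."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    if total == 0.0:
        # Assumption of this sketch: with no overlap, fall back to the new measurement.
        return list(likelihood)
    return [u / total for u in unnorm]

def mean_and_std(dist, depths):
    """Expected depth and its standard deviation (used here as the uncertainty)."""
    mean = sum(p * d for p, d in zip(dist, depths))
    var = sum(p * (d - mean) ** 2 for p, d in zip(dist, depths))
    return mean, math.sqrt(var)

def gaussian_measurement(depths, center, sigma):
    """Hypothetical stand-in for a per-frame, per-pixel depth distribution."""
    weights = [math.exp(-0.5 * ((d - center) / sigma) ** 2) for d in depths]
    total = sum(weights)
    return [w / total for w in weights]

# 64 depth candidates between 0.5 and 5.0 (illustrative values only).
depths = [0.5 + 4.5 * i / 63 for i in range(64)]

# Start from the first frame's distribution, then fold in later frames one by one.
belief = gaussian_measurement(depths, 2.1, 0.5)
for center in (1.9, 2.0, 2.05):
    belief = fuse(belief, gaussian_measurement(depths, center, 0.5))

mean, std = mean_and_std(belief, depths)
print(f"fused depth: {mean:.3f}, uncertainty (std): {std:.3f}")
```

Because each update multiplies distributions that roughly agree, the fused distribution becomes narrower than any single-frame one, which is the sense in which sequential accumulation reduces depth uncertainty.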

Chao Liu, Jinwei Gu, Kihwan Kim, Srinivasa Narasimhan, Jan Kautz • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Depth Estimation | KITTI (Eigen split) | RMSE 2.829 | 276 |
| Monocular Depth Estimation | KITTI (Eigen split) | Abs Rel 0.1 | 193 |
| Depth Estimation | ScanNet (test) | Abs Rel 0.1013 | 65 |
| 2D Depth Estimation | 7 Scenes | Abs Rel 0.1758 | 20 |
| Depth Estimation | 7-Scenes (test) | Abs Rel 0.2334 | 19 |
| Multi-view Depth Estimation | ScanNet 16 (test) | Abs Rel Error 0.1013 | 12 |
| Depth Estimation | KITTI (official split) | Absolute Relative Error 0.1 | 10 |
| Monocular Depth Estimation | ScanNet monocular variant 20 60-frame sequences | OPW 0.043 | 7 |
