
DeepV2D: Video to Depth with Differentiable Structure from Motion

About

We propose DeepV2D, an end-to-end deep learning architecture for predicting depth from video. DeepV2D combines the representational power of neural networks with the geometric principles governing image formation. We take a collection of classical geometric algorithms, convert them into trainable modules, and combine them into an end-to-end differentiable architecture. DeepV2D interleaves two stages: motion estimation and depth estimation. During inference, the motion and depth estimates are alternately updated and converge to an accurate depth map. Code is available at https://github.com/princeton-vl/DeepV2D.
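The alternating inference scheme described above can be illustrated with a toy fixed-point iteration. This is a hypothetical sketch, not the DeepV2D implementation: the real motion and depth modules are deep networks, while here they are stand-in scalar updates (`update_motion`, `update_depth`, and the `step`/`target` parameters are all invented for illustration) that merely show how alternating two coupled updates drives both estimates to a consistent solution.

```python
# Toy stand-ins for DeepV2D's two stages (hypothetical simplifications;
# the real modules are trainable neural networks operating on images).
def update_motion(depth, target_motion=1.0, step=0.5):
    """Motion step: refine the camera-motion estimate given current depth."""
    # Pull motion toward a value consistent with the current depth estimate.
    return target_motion + step * (depth - 1.0)

def update_depth(motion, target_depth=1.0, step=0.5):
    """Depth step: refine the depth estimate given current motion."""
    return target_depth + step * (motion - 1.0)

def alternating_inference(depth_init=2.0, iters=20):
    """Alternate motion and depth updates, as in DeepV2D inference."""
    depth = depth_init
    motion = update_motion(depth)
    for _ in range(iters):
        motion = update_motion(depth)
        depth = update_depth(motion)
    return depth, motion
```

Each full round contracts the error by a constant factor in this toy setting, so both estimates converge to the mutually consistent fixed point (here, 1.0) regardless of the initialization.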

Zachary Teed, Jia Deng • 2018

Related benchmarks

| Task                       | Dataset               | Metric              | Result | Rank |
|----------------------------|-----------------------|---------------------|--------|------|
| Depth Estimation           | KITTI (Eigen split)   | RMSE                | 2.483  | 276  |
| Monocular Depth Estimation | KITTI (test)          | Abs Rel Error       | 0.037  | 103  |
| Depth Estimation           | ScanNet (test)        | Abs Rel             | 0.057  | 65   |
| Video Depth Estimation     | Sintel (test)         | Delta 1 Accuracy    | 50.9   | 57   |
| Visual-Inertial Odometry   | EuRoC (All sequences) | MH1 Error           | 0.739  | 51   |
| Camera Pose Estimation     | TUM freiburg1         | Rotation Error      | 0.105  | 34   |
| Visual Odometry            | TUM-RGBD              | freiburg1/xyz Error | 0.15   | 34   |
| 3D Reconstruction          | 7 Scenes              | --                  | --     | 32   |
| Video Depth Estimation     | KITTI (test)          | Delta 1             | 97.2   | 25   |
| Video Depth Estimation     | VDW (test)            | Delta 1             | 54.6   | 24   |

Showing 10 of 34 rows.
