MegaDepth: Learning Single-View Depth Prediction from Internet Photos

About

Single-view depth prediction is a fundamental problem in computer vision. Recently, deep learning methods have led to significant progress, but such methods are limited by the available training data. Current datasets based on 3D sensors have key limitations, including indoor-only images (NYU), small numbers of training examples (Make3D), and sparse sampling (KITTI). We propose to use multi-view Internet photo collections, a virtually unlimited data source, to generate training data via modern structure-from-motion and multi-view stereo (MVS) methods, and present a large depth dataset called MegaDepth based on this idea. Data derived from MVS comes with its own challenges, including noise and unreconstructable objects. We address these challenges with new data cleaning methods, as well as automatically augmenting our data with ordinal depth relations generated using semantic segmentation. We validate the use of large amounts of Internet data by showing that models trained on MegaDepth exhibit strong generalization, not only to novel scenes but also to other diverse datasets including Make3D, KITTI, and DIW, even when no images from those datasets are seen during training.
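The abstract mentions ordinal depth relations but does not say how they enter training. As a rough illustration only, the sketch below shows a generic logistic ranking loss over pixel pairs, which is one common way to use "pixel i is farther than pixel j" labels. It is not taken from the paper: the function name `ordinal_loss`, the pair format `(i, j, r)`, and the use of log-depth are all assumptions made for this example.

```python
import math

def ordinal_loss(log_depth, pairs):
    """Generic pairwise ranking loss (hypothetical, not the paper's loss).

    log_depth: list of predicted log-depths, one per pixel.
    pairs: list of (i, j, r), where r = +1 if pixel i should be farther
           than pixel j, and r = -1 if it should be nearer.
    """
    total = 0.0
    for i, j, r in pairs:
        diff = log_depth[i] - log_depth[j]
        # Small when the predicted order agrees with the label r,
        # large when it disagrees.
        total += math.log1p(math.exp(-r * diff))
    return total / len(pairs)

# Pixel 1 is predicted farther than pixel 0.
prediction = [0.0, 2.0]
agree = ordinal_loss(prediction, [(1, 0, +1)])     # label matches prediction
disagree = ordinal_loss(prediction, [(1, 0, -1)])  # label contradicts it
```

With this formulation a pair whose predicted order matches its label contributes a small loss, and a contradicted pair contributes a large one, so ordinal labels can supervise pixels for which no MVS depth is available.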

Zhengqi Li, Noah Snavely • 2018

Related benchmarks

Task                        Dataset                      Metric            Result  Rank
Monocular Depth Estimation  KITTI                        Abs Rel           0.201   161
Depth Estimation            ScanNet                      AbsRel            19      94
Depth Estimation            KITTI                        AbsRel            39.1    92
Monocular Depth Estimation  ScanNet                      AbsRel            26      64
Depth Estimation            DIODE                        Delta-1 Accuracy  71.2    62
Depth Prediction            ETH3D                        AbsRel            39.8    35
Depth Prediction            Sintel                       AbsRel            39.8    32
2D Depth Estimation         ScanNet                      AbsRel            26      26
Monocular Depth Estimation  MVSEC outdoor night1 (test)  Abs Error         2.54    21
Monocular Depth Estimation  MVSEC outdoor day1 (test)    Abs Depth Error   2.37    21
Showing 10 of 19 rows
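
For readers unfamiliar with the metrics named in these benchmarks, the sketch below shows the standard definitions of absolute relative error (AbsRel) and Delta-1 accuracy (the fraction of pixels whose predicted-to-true depth ratio is within 1.25). It is a minimal pure-Python illustration with made-up depth values. The function names are chosen for this example, and the listing does not state whether its values are fractions or percentages, so no attempt is made to reproduce the numbers above.

```python
def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)

def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) below the threshold."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    # A non-positive prediction cannot satisfy the ratio test, so it counts as a miss.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return hits / len(pairs)

# Toy depths in arbitrary units (illustrative values only).
predicted = [1.0, 2.2, 3.0, 5.0]
truth = [1.0, 2.0, 4.0, 4.0]

error = abs_rel(predicted, truth)    # relative errors: 0, 0.1, 0.25, 0.25
accuracy = delta1(predicted, truth)  # ratios: 1.0, 1.1, 1.33, 1.25
```

Lower is better for AbsRel, and higher is better for Delta-1 accuracy. In the toy data, the last pixel has a ratio of exactly 1.25 and therefore does not count as a hit under the strict inequality.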
