Deep Stereo using Adaptive Thin Volume Representation with Uncertainty Awareness

About

We present Uncertainty-aware Cascaded Stereo Network (UCS-Net) for 3D reconstruction from multiple RGB images. Multi-view stereo (MVS) aims to reconstruct fine-grained scene geometry from multi-view images. Previous learning-based MVS methods estimate per-view depth using plane sweep volumes with a fixed depth hypothesis at each plane; this generally requires densely sampled planes for desired accuracy, and it is very hard to achieve high-resolution depth. In contrast, we propose adaptive thin volumes (ATVs); in an ATV, the depth hypothesis of each plane is spatially varying, which adapts to the uncertainties of previous per-pixel depth predictions. Our UCS-Net has three stages: the first stage processes a small standard plane sweep volume to predict low-resolution depth; two ATVs are then used in the following stages to refine the depth with higher resolution and higher accuracy. Our ATV consists of only a small number of planes; yet, it efficiently partitions local depth ranges within learned small intervals. In particular, we propose to use variance-based uncertainty estimates to adaptively construct ATVs; this differentiable process introduces reasonable and fine-grained spatial partitioning. Our multi-stage framework progressively subdivides the vast scene space with increasing depth resolution and precision, which enables scene reconstruction with high completeness and accuracy in a coarse-to-fine fashion. We demonstrate that our method achieves superior performance compared with state-of-the-art benchmarks on various challenging datasets.
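The variance-based construction of an adaptive thin volume described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the array layout `(D, H, W)`, the `scale` factor of 1.5, and the uniform spacing of planes inside each per-pixel interval are all hypothetical choices made for the example. The abstract only states that variance-based uncertainty estimates drive the spatially varying depth hypotheses.

```python
import numpy as np


def depth_mean_and_std(prob, hypotheses):
    """Per-pixel expected depth and standard deviation.

    prob, hypotheses: arrays of shape (D, H, W); prob sums to 1 along axis 0.
    """
    mean = np.sum(prob * hypotheses, axis=0)
    var = np.sum(prob * (hypotheses - mean) ** 2, axis=0)
    return mean, np.sqrt(var)


def build_thin_volume(mean, std, num_planes, scale=1.5):
    """Spatially varying depth hypotheses of shape (num_planes, H, W).

    Each pixel gets its own interval [mean - scale*std, mean + scale*std]
    (scale is an assumed hyperparameter), sampled uniformly.
    """
    lo = mean - scale * std
    hi = mean + scale * std
    t = np.linspace(0.0, 1.0, num_planes).reshape(-1, 1, 1)
    return lo + t * (hi - lo)


# Toy input: 8 fixed planes over a 2x2 image, as in a standard plane sweep volume.
D, H, W = 8, 2, 2
hyp = np.broadcast_to(np.linspace(1.0, 8.0, D).reshape(D, 1, 1), (D, H, W))

# Pixel (0, 0) is confident about plane index 3; all other pixels are uniform.
logits = np.zeros((D, H, W))
logits[3, 0, 0] = 10.0
prob = np.exp(logits) / np.exp(logits).sum(axis=0)

mean, std = depth_mean_and_std(prob, hyp)
atv = build_thin_volume(mean, std, num_planes=4)

# The confident pixel receives a much narrower depth interval than the others.
width = atv[-1] - atv[0]
print(width[0, 0] < width[1, 1])
```

In this sketch the confident pixel gets a narrow interval and the uncertain pixels get wide ones, so the same small number of planes yields finer depth resolution where the previous prediction was reliable.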

Shuo Cheng, Zexiang Xu, Shilin Zhu, Zhuwen Li, Li Erran Li, Ravi Ramamoorthi, Hao Su • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-view Stereo | Tanks and Temples Intermediate set | Mean F1 Score: 54.83 | 110 |
| Depth Estimation | ScanNet (test) | Abs Rel: 0.0845 | 65 |
| Multi-view Stereo | DTU (test) | Accuracy: 33.8 | 61 |
| Multi-view Stereo | DTU 1 (evaluation) | Accuracy Error (mm): 0.338 | 51 |
| Multi-view Stereo | Tanks&Temples | Family: 76.09 | 46 |
| Multi-view Stereo | Tanks & Temples Intermediate | F-score: 54.83 | 43 |
| Multi-view Stereo Reconstruction | DTU (evaluation) | Mean Distance (mm) - Acc.: 0.338 | 35 |
| Depth Estimation | 7-Scenes (test) | Abs Rel: 0.2113 | 19 |
| Point Cloud Reconstruction | DTU high-resolution (test) | Accuracy: 33.8 | 16 |
| Point Cloud Reconstruction | DTU (test) | Accuracy: 33.8 | 15 |
