Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

About

Deep learning has recently demonstrated excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This dramatically reduces memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to memory constraints. Code is available at https://github.com/YoYo000/MVSNet.
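To make the idea of sequential regularization concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a scalar GRU applied independently at each pixel (the real network would use learned convolutional gates), made-up weights in `WEIGHTS`, cost maps that store matching scores where higher is better, and a running winner-take-all choice of depth plane. The names `gru_step`, `depth_from_cost_maps`, and `toy_cost_maps` are hypothetical. What the sketch does show is that only one H x W hidden state is held while the depth planes stream past, rather than the full H x W x D volume.

```python
import math

# Hypothetical scalar weights; a trained network would learn convolution kernels instead.
WEIGHTS = {"wz": 1.0, "uz": 0.5, "wr": 1.0, "ur": 0.5, "wh": 1.0, "uh": 0.5}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(cost, hidden, w=WEIGHTS):
    """One GRU update for a single pixel with a scalar input and scalar state."""
    z = sigmoid(w["wz"] * cost + w["uz"] * hidden)  # update gate
    r = sigmoid(w["wr"] * cost + w["ur"] * hidden)  # reset gate
    candidate = math.tanh(w["wh"] * cost + w["uh"] * (r * hidden))
    return (1.0 - z) * hidden + z * candidate


def depth_from_cost_maps(cost_maps, w=WEIGHTS):
    """Consume 2D cost maps one depth plane at a time.

    Only the hidden state and the running best score/index (each H x W)
    are kept in memory, never the whole stack of depth planes.
    """
    hidden = best_score = best_index = None
    for d, cost_map in enumerate(cost_maps):
        if hidden is None:
            hidden = [[0.0] * len(row) for row in cost_map]
            best_score = [[-math.inf] * len(row) for row in cost_map]
            best_index = [[0] * len(row) for row in cost_map]
        for y, row in enumerate(cost_map):
            for x, cost in enumerate(row):
                hidden[y][x] = gru_step(cost, hidden[y][x], w)
                # Running winner-take-all over depth (an assumption of this sketch).
                if hidden[y][x] > best_score[y][x]:
                    best_score[y][x] = hidden[y][x]
                    best_index[y][x] = d
    return best_index


def toy_cost_maps(height, width, depth, peak):
    """Yield synthetic cost maps whose score is high only at plane `peak`."""
    for d in range(depth):
        value = 1.0 if d == peak else -1.0
        yield [[value] * width for _ in range(height)]


depth_map = depth_from_cost_maps(toy_cost_maps(2, 3, 8, peak=5))
```

Because `toy_cost_maps` is a generator, each depth plane can be discarded as soon as it has been processed, which is the property that makes the recurrent formulation attractive for high-resolution inputs.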

Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-view Stereo | Tanks and Temples Intermediate set | Mean F1 Score: 50.55 | 110 |
| Multi-view Stereo | Tanks & Temples Advanced | Mean F-score: 29.55 | 71 |
| Multi-view Stereo | DTU (test) | Accuracy: 38.5 | 61 |
| Multi-view Stereo | DTU 1 (evaluation) | Accuracy Error (mm): 0.383 | 51 |
| Multi-view Stereo | Tanks&Temples | Family: 73.01 | 46 |
| Multi-view Stereo | Tanks & Temples Intermediate | F-score: 48.4 | 43 |
| Multi-view Stereo | Tanks & Temples Advanced | F-score: 24.91 | 36 |
| Multi-view Stereo Reconstruction | DTU (evaluation) | Mean Distance (mm) - Acc.: 0.383 | 35 |
| Multi-view Stereo | Tanks and Temples (Advanced set) | Aud. Error: 19.49 | 28 |
| Point Cloud Reconstruction | DTU high-resolution (test) | Accuracy: 38.3 | 16 |
