
Visibility-aware Multi-view Stereo Network

About

Learning-based multi-view stereo (MVS) methods have demonstrated promising results. However, very few existing networks explicitly take pixel-wise visibility into consideration, resulting in erroneous cost aggregation from occluded pixels. In this paper, we explicitly infer and integrate pixel-wise occlusion information into the MVS network via matching uncertainty estimation. The pair-wise uncertainty map is inferred jointly with the pair-wise depth map and is then used as weighting guidance during multi-view cost volume fusion. As such, the adverse influence of occluded pixels is suppressed in the cost fusion. The proposed framework, Vis-MVSNet, significantly improves depth accuracy in scenes with severe occlusion. Extensive experiments on the DTU, BlendedMVS, and Tanks and Temples datasets demonstrate the effectiveness of the proposed framework.
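
The core idea above, fusing pair-wise cost volumes with weights derived from jointly estimated uncertainty maps, can be illustrated with a minimal PyTorch-style sketch. The function name, tensor shapes, and the exp(-u) confidence weighting below are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def fuse_cost_volumes(pairwise_costs, uncertainties, eps=1e-6):
    """Uncertainty-weighted fusion of per-source-view cost volumes.

    pairwise_costs: list of (B, C, D, H, W) cost volumes, one per source view
    uncertainties:  list of (B, 1, H, W) matching-uncertainty maps
                    (higher value = less reliable, e.g. occluded pixels)
    """
    # Turn uncertainty into a per-pixel confidence weight; occluded pixels
    # receive small weights and contribute little to the fused volume.
    weights = [torch.exp(-u) for u in uncertainties]   # (B, 1, H, W) each
    weight_sum = sum(weights) + eps                    # avoid division by zero
    fused = sum(w.unsqueeze(2) * c                     # broadcast over depth dim D
                for w, c in zip(weights, pairwise_costs))
    return fused / weight_sum.unsqueeze(2)
```

Because the uncertainty maps are predicted jointly with the pair-wise depth maps, this weighting needs no extra visibility supervision: unreliable (occluded) matches are down-weighted automatically before the fused volume is regularized into the final depth map.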

Jingyang Zhang, Yao Yao, Shiwei Li, Zixin Luo, Tian Fang • 2020

Related benchmarks

Task                        Dataset                               Metric                Result   Rank
Multi-view Stereo           Tanks and Temples Intermediate set    Mean F1 Score         60.03    110
Multi-view Stereo           Tanks & Temples Advanced              Mean F-score          0.3378   71
Multi-view Stereo           DTU (test)                            Accuracy              36.9     61
Multi-view Stereo           DTU 1 (evaluation)                    Accuracy Error (mm)   0.369    51
Multi-view Stereo           Tanks & Temples Intermediate          F-score               60.03    43
Multi-view Stereo           Tanks & Temples Advanced              F-score               33.78    36
Multi-view Stereo           Tanks and Temples (Advanced set)      Aud. Error            20.79    28
Point Cloud Reconstruction  DTU high-resolution (test)            Accuracy              36.9     16
Multi-view Stereo           BlendedMVS (val)                      EPE                   1.47     13
