IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo
About
We present IterMVS, a new data-driven method for high-resolution multi-view stereo. We propose a novel GRU-based estimator that encodes pixel-wise probability distributions of depth in its hidden state. Ingesting multi-scale matching information, our model refines these distributions over multiple iterations and infers depth and confidence. To extract the depth maps, we combine traditional classification and regression in a novel manner. We verify the efficiency and effectiveness of our method on DTU, Tanks&Temples and ETH3D. While being the most efficient method in both memory and run-time, our model achieves competitive performance on DTU and better generalization ability on Tanks&Temples as well as ETH3D than most state-of-the-art methods. Code is available at https://github.com/FangjinhuaWang/IterMVS.
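One way to read the combination of classification and regression is: pick the depth hypothesis with the highest probability (classification), then take a probability-weighted average of the hypotheses in a small window around that peak (regression). The sketch below illustrates this reading for a single pixel. It is not the authors' implementation; the function name `hybrid_depth`, the `radius` parameter, and the use of plain depth values as hypotheses are assumptions made for illustration.

```python
def hybrid_depth(probs, depth_bins, radius=1):
    """Illustrative per-pixel depth from a discrete probability distribution.

    probs      -- probabilities over the depth hypotheses (assumed to sum to 1)
    depth_bins -- depth value of each hypothesis, same length as probs
    radius     -- hypothetical half-width of the window around the peak
    """
    if len(probs) != len(depth_bins) or not probs:
        raise ValueError("probs and depth_bins must be non-empty and equal in length")

    # Classification step: index of the most probable hypothesis.
    peak = max(range(len(probs)), key=probs.__getitem__)

    # Regression step: expectation restricted to a window around the peak,
    # renormalised by the probability mass inside that window.
    lo = max(0, peak - radius)
    hi = min(len(probs), peak + radius + 1)
    mass = sum(probs[lo:hi])
    return sum(p * d for p, d in zip(probs[lo:hi], depth_bins[lo:hi])) / mass


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.4, 0.3]
    depth_bins = [1.0, 2.0, 3.0, 4.0]
    print(hybrid_depth(probs, depth_bins, radius=1))
```

With `radius=0` the function reduces to pure classification (the peak hypothesis), and with a radius that covers all hypotheses it reduces to the usual full expectation. The window is a compromise that keeps sub-bin precision while ignoring probability mass far from the peak.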
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Monocular Depth Estimation | DDAD (test) | RMSE | 7.95 | 122 |
| Monocular Depth Estimation | KITTI (test) | Abs Rel Error | 0.057 | 103 |
| Depth Estimation | ScanNet | AbsRel | 0.123 | 94 |
| Multi-view Stereo | Tanks & Temples Advanced | Mean F-score | 33.24 | 71 |
| Multi-view Stereo | DTU (test) | Accuracy | 37.3 | 61 |
| Multi-view Stereo | DTU 1 (evaluation) | Accuracy Error (mm) | 0.373 | 51 |
| Multi-view Stereo | Tanks&Temples | Family | 76.12 | 46 |
| Multi-view Stereo | Tanks & Temples Intermediate | F-score | 56.94 | 43 |
| Multi-view Stereo Reconstruction | ETH3D (test) | Accuracy | 84.73 | 41 |
| Multi-view Stereo Reconstruction | ETH3D (train) | Accuracy | 79.79 | 41 |