SimpleRecon: 3D Reconstruction Without 3D Convolutions
About
Traditionally, 3D indoor scene reconstruction from posed images happens in two phases: per-image depth estimation, followed by depth merging and surface reconstruction. Recently, a family of methods has emerged that perform reconstruction directly in a final 3D volumetric feature space. While these methods have shown impressive reconstruction results, they rely on expensive 3D convolutional layers, limiting their application in resource-constrained environments. In this work, we instead go back to the traditional route, and show how focusing on high-quality multi-view depth prediction leads to highly accurate 3D reconstructions using simple off-the-shelf depth fusion. We propose a simple state-of-the-art multi-view depth estimator with two main contributions: 1) a carefully designed 2D CNN which utilizes strong image priors alongside a plane-sweep feature volume and geometric losses, combined with 2) the integration of keyframe and geometric metadata into the cost volume, which allows informed depth plane scoring. Our method achieves a significant lead over the current state of the art for depth estimation, and results close to or better than the state of the art for 3D reconstruction on ScanNet and 7-Scenes, yet still allows for online real-time low-memory reconstruction. Code, models and results are available at https://nianticlabs.github.io/simplerecon
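The abstract does not say which depth fusion method is used. As an illustration only, the sketch below assumes truncated signed distance function (TSDF) fusion, a common off-the-shelf choice, and applies it to the voxels along a single camera ray. The function name `fuse_depth_along_ray`, the truncation distance and the sample depths are all hypothetical and are not taken from the SimpleRecon code.

```python
import numpy as np

def fuse_depth_along_ray(tsdf, weights, voxel_depths, observed_depth, trunc=0.1):
    """Fuse one depth observation into the TSDF of voxels lying on one camera ray.

    Hypothetical helper for illustration; not part of the SimpleRecon release.
    """
    # Signed distance to the observed surface: positive in front, negative behind.
    sdf = observed_depth - voxel_depths
    # Voxels further than `trunc` behind the surface are occluded: leave them untouched.
    valid = sdf >= -trunc
    # Truncate and normalise to [-1, 1].
    tsdf_obs = np.clip(sdf / trunc, -1.0, 1.0)
    # Running average with unit weight per observation.
    new_weights = weights + valid.astype(weights.dtype)
    averaged = (tsdf * weights + tsdf_obs) / np.maximum(new_weights, 1.0)
    fused = np.where(valid, averaged, tsdf)
    return fused, new_weights

# Voxel centres every 10 cm between 0.5 m and 1.5 m along the ray.
voxel_depths = np.linspace(0.5, 1.5, 11)
tsdf = np.ones_like(voxel_depths)      # initialised as "free space"
weights = np.zeros_like(voxel_depths)  # no observations yet

# Three noisy depth predictions of a surface near 1.0 m (made-up values).
for depth in (1.02, 0.98, 1.00):
    tsdf, weights = fuse_depth_along_ray(tsdf, weights, voxel_depths, depth)
```

After the three updates, the fused values change sign around 1.0 m, which is where a mesh extractor such as marching cubes would place the surface. Averaging across views is what lets noisy per-image depth predictions produce a consistent reconstruction.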
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Scene Reconstruction | 7-Scenes (test) | Accuracy 8 | 27 |
| 3D Scene Reconstruction | ScanNet v2 (test) | Accuracy 0.065 | 26 |
| 3D Reconstruction | ScanNet | F-score 68.3 | 13 |
| Surface Reconstruction | ASE (val) | Accuracy 53.9 | 10 |
| Surface Reconstruction | ADT dataset | Accuracy 32.6 | 10 |
| 3D surface reconstruction | ScanNet V2 (testplanes) | Chamfer Distance 5.4 | 7 |
| 3D Scene Reconstruction | NYU 9 unseen scenes v2 (test) | C-l1 0.051 | 7 |
| 3D Plane Reconstruction | ScanNet V2 (testplanes) | Fidelity 9.42 | 7 |
| 3D Plane Segmentation | ScanNet V2 (testplanes) | VOI 2.507 | 7 |
| 3D Mesh Reconstruction Quality Evaluation | ScanNet (test) | Aspe Mean 0.436 | 6 |