Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers
About
In this paper, we develop novel, efficient 2D encodings for 3D geometry, which enable reconstructing full 3D shapes from a single image at high resolution. The key idea is to pose 3D shape reconstruction as a 2D prediction problem. To that end, we first develop a simple baseline network that predicts entire voxel tubes at each pixel of a reference view. By leveraging well-proven architectures for 2D pixel-prediction tasks, we attain state-of-the-art results, clearly outperforming purely voxel-based approaches. We scale this baseline to higher resolutions by proposing a memory-efficient shape encoding, which recursively decomposes a 3D shape into nested shape layers, similar to the pieces of a Matryoshka doll. This allows reconstructing highly detailed shapes with complex topology, as demonstrated in extensive experiments; we clearly outperform previous octree-based approaches despite having a much simpler architecture using standard network components. Our Matryoshka networks further enable reconstructing shapes from IDs or shape similarity, as well as shape sampling.
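The nested decomposition described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes that a single shape layer is stored as three pairs of 2D depth maps (first and last occupied voxel along each axis) and decoded as the intersection of the three per-axis fills, and that each further layer encodes the part of the previous layer that was over-filled. The function names (`encode_shape_layer`, `decode_shape_layer`, `nested_encode`, `nested_decode`) and the `max_layers` parameter are hypothetical, and no network is involved; the sketch only shows how a 3D shape can be represented by a short stack of 2D maps.

```python
import numpy as np


def encode_shape_layer(vox):
    """Encode a boolean voxel grid as three pairs of 2D depth maps.

    For each axis, `first` and `last` hold the index of the first and last
    occupied voxel along that axis (n and -1 mark rays that hit nothing).
    This layout is an assumption made for illustration.
    """
    maps = []
    for axis in range(3):
        n = vox.shape[axis]
        hit = vox.any(axis=axis)
        first = np.where(hit, vox.argmax(axis=axis), n)
        last = np.where(hit, n - 1 - np.flip(vox, axis=axis).argmax(axis=axis), -1)
        maps.append((first, last))
    return maps


def decode_shape_layer(maps, shape):
    """Fill every ray between its two depth values and intersect the three axes."""
    out = np.ones(shape, dtype=bool)
    for axis, (first, last) in enumerate(maps):
        idx_shape = [1, 1, 1]
        idx_shape[axis] = shape[axis]
        idx = np.arange(shape[axis]).reshape(idx_shape)
        lo = np.expand_dims(first, axis)
        hi = np.expand_dims(last, axis)
        out &= (idx >= lo) & (idx <= hi)
    return out


def nested_encode(vox, max_layers=5):
    """Recursively encode a shape: each layer encodes what the previous one over-filled."""
    layers = []
    residual = vox.astype(bool)
    for _ in range(max_layers):
        if not residual.any():
            break
        maps = encode_shape_layer(residual)
        layers.append(maps)
        filled = decode_shape_layer(maps, residual.shape)
        residual = filled & ~residual  # over-filled region becomes the next shape
    return layers


def nested_decode(layers, shape):
    """Rebuild the shape from the innermost layer outwards by alternating subtraction.

    The result is exact only if nested_encode stopped on an empty residual.
    """
    out = np.zeros(shape, dtype=bool)
    for maps in reversed(layers):
        out = decode_shape_layer(maps, shape) & ~out
    return out


# Example: a cube with an enclosed cavity, which a single layer cannot represent.
vox = np.zeros((8, 8, 8), dtype=bool)
vox[1:7, 1:7, 1:7] = True
vox[3:5, 3:5, 3:5] = False
layers = nested_encode(vox)
recon = nested_decode(layers, vox.shape)
print(len(layers), np.array_equal(recon, vox))
```

In this toy case the first layer captures the outer cube and the second captures the cavity, so the shape is described entirely by 2D maps of size 8×8 rather than by a dense 8×8×8 grid.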
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Object Reconstruction | ShapeNet (test) | Mean IoU | 0.635 | 80 |
| 3D Object Reconstruction | ShapeNet Cars (test) | IoU | 79.4 | 20 |
| 3D Reconstruction | ShapeNet | mIoU (Car) | 85 | 17 |
| Single-image 3D Reconstruction | ShapeNetCore (test) | mIoU | 63.4 | 11 |
| 3D Object Reconstruction | Things3D | mIoU (chair) | 0.399 | 10 |
| Single-view 3D Object Reconstruction | ShapeNet (test) | Airplane | 0.446 | 10 |
| Single-view 3D Object Reconstruction | Things3D (test) | F-Score@1% (chair) | 23.1 | 10 |
| Single-view 3D Object Reconstruction | ShapeNetCore (Unseen categories) | mIoU | 0.299 | 8 |
| Single-view 3D Object Reconstruction | ShapeNet | Params (M) | 45.66 | 4 |