Neural Volumetric Object Selection
About
We introduce an approach for selecting objects in neural volumetric 3D representations, such as multi-plane images (MPI) and neural radiance fields (NeRF). Our approach takes a set of foreground and background 2D user scribbles in one view and automatically estimates a 3D segmentation of the desired object, which can be rendered into novel views. To achieve this result, we propose a novel voxel feature embedding that incorporates the neural volumetric 3D representation and multi-view image features from all input views. To evaluate our approach, we introduce a new dataset of human-provided segmentation masks for depicted objects in real-world multi-view scene captures. We show that our approach outperforms strong baselines, including 2D and 3D segmentation approaches adapted to our task.
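To make the input concrete, the following is a minimal sketch of how foreground and background scribbles drawn in one view could be attached to voxels as sparse seed labels. It is not the authors' method, which relies on a learned voxel feature embedding; it only illustrates the relationship between 2D scribble pixels and 3D voxel centers. The pinhole camera model, the names `project` and `seed_voxels`, and the label convention (1 = foreground, 0 = background, -1 = unlabeled) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Point3 = Tuple[float, float, float]
Pixel = Tuple[int, int]


def project(point: Point3, focal: float, cx: float, cy: float) -> Optional[Pixel]:
    """Project a camera-space point with an assumed pinhole model.

    Returns None when the point lies on or behind the camera plane.
    """
    x, y, z = point
    if z <= 0:
        return None
    u = int(round(focal * x / z + cx))
    v = int(round(focal * y / z + cy))
    return (u, v)


def seed_voxels(
    voxel_centers: List[Point3],
    scribbles: Dict[Pixel, int],
    focal: float,
    cx: float,
    cy: float,
) -> List[int]:
    """Give each voxel the label of the scribble pixel it projects onto.

    Hypothetical label convention: 1 = foreground, 0 = background,
    -1 = no scribble at that pixel (or voxel not visible).
    """
    labels = []
    for center in voxel_centers:
        pixel = project(center, focal, cx, cy)
        labels.append(scribbles.get(pixel, -1))
    return labels


# Toy scene: four voxel centers in camera space, two scribbled pixels.
centers = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0), (0.5, 0.5, 1.0)]
scribbles = {(50, 50): 1, (100, 50): 0}
labels = seed_voxels(centers, scribbles, focal=100.0, cx=50.0, cy=50.0)
```

In this toy setup only the first two voxels receive a label; the rest stay unlabeled, which is why a full method needs some way to propagate the sparse seeds to the remaining voxels.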
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 2D mask segmentation | LLFF (val) | Accuracy | 92 | 9 |
| 2D mask segmentation | Shiny (val) | Accuracy | 0.907 | 9 |
| 3D Instance Segmentation | NVOS | mIoU | 70.1 | 4 |
| Novel-view object rendering | LLFF (val) | SSIM | 76.7 | 2 |
| Novel-view object rendering | Shiny (val) | SSIM | 0.612 | 2 |
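For readers unfamiliar with the mIoU metric in the table, here is a minimal sketch of intersection-over-union for binary masks, averaged over several mask pairs. How the benchmark actually aggregates scores (per view, per scene, or otherwise) is not stated here, so the averaging in `mean_iou` and the choice to score two empty masks as 1.0 are assumptions.

```python
from typing import List

Mask = List[List[int]]  # rows of 0/1 values


def iou(pred: Mask, target: Mask) -> float:
    """Intersection-over-union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds: List[Mask], targets: List[Mask]) -> float:
    """Unweighted mean of per-pair IoU scores (assumed aggregation)."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


pred_a = [[1, 1], [0, 0]]
target_a = [[1, 0], [0, 0]]
pred_b = [[0, 1], [0, 1]]
target_b = [[0, 1], [0, 1]]
score = mean_iou([pred_a, pred_b], [target_a, target_b])
```

The first pair overlaps in one of two selected pixels and the second pair matches exactly, so the two per-pair scores differ before being averaged.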