SAMa: Material-aware 3D Selection and Segmentation
About
Decomposing 3D assets into material parts is a common task for artists, yet it remains a highly manual process. In this work, we introduce Select Any Material (SAMa), a material selection approach for in-the-wild objects in arbitrary 3D representations. Building on SAM2's video prior, we construct a material-centric video dataset to extend the model to the material domain. We propose an efficient way to lift the model's 2D predictions to 3D by projecting each view into an intermediary 3D point cloud using depth. Nearest-neighbor lookups between any 3D representation and this similarity point cloud allow us to efficiently reconstruct accurate selection masks over objects' surfaces that can be inspected from any view. Our method is multiview-consistent by design, obviating the need for costly per-asset optimization, and performs optimization-free selection in seconds. SAMa outperforms several strong baselines in selection accuracy and multiview consistency, and enables compelling applications such as replacing the diffuse-textured materials on a text-to-3D output with PBR materials, or selecting and editing materials on NeRF and 3DGS captures.
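The 2D-to-3D lifting described above can be sketched in a few lines: each view's per-pixel similarity map is unprojected into world space using its depth map and camera parameters, the resulting point clouds are fused, and arbitrary surface points are then labeled by nearest-neighbor lookup. This is a minimal illustrative sketch, not the paper's implementation; the function names, the single-neighbor query, and the 0.5 threshold are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree


def unproject_depth(depth, K, cam_to_world):
    """Lift a depth map to world-space 3D points (hypothetical helper).

    depth: (H, W) depth per pixel; K: (3, 3) intrinsics;
    cam_to_world: (4, 4) camera pose.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T                # camera-space ray directions
    pts_cam = rays * depth.reshape(-1, 1)          # scale rays by depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]         # transform to world space


def build_similarity_cloud(depths, sims, Ks, poses):
    """Fuse per-view 2D similarity maps into one similarity point cloud."""
    pts, vals = [], []
    for d, s, K, P in zip(depths, sims, Ks, poses):
        pts.append(unproject_depth(d, K, P))
        vals.append(s.reshape(-1))
    return np.concatenate(pts), np.concatenate(vals)


def select(surface_pts, cloud_pts, cloud_vals, threshold=0.5):
    """Transfer similarities to arbitrary surface points via nearest neighbor."""
    tree = cKDTree(cloud_pts)
    _, idx = tree.query(surface_pts, k=1)
    return cloud_vals[idx] > threshold
```

Because the lookup only needs surface points, the same similarity cloud can serve a mesh, a NeRF, or a 3DGS capture: sample points on the representation's surface and query them against the fused cloud.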
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Material Selection | NeRF | mIoU | 48 | 4 |
| Material Selection | MIPNeRF-360 | mIoU | 60 | 4 |
| Material Selection | Our Dataset (test) | mIoU | 69 | 4 |
| Multiview Consistency | NeRF unseen views 49 (test) | Hamming Distance | 2.2 | 4 |
| Multiview Consistency | MIPNeRF-360 unseen views 2 (test) | Hamming Distance | 0.014 | 4 |
| Multiview Consistency | Object-centric (test) | Hamming Distance (x100) | 1.7 | 4 |
| Robustness | NeRF unseen views 49 (test) | Hamming Distance (x100) | 1.1 | 4 |
| Robustness | MIPNeRF-360 unseen views 2 (test) | Hamming Distance (x100) | 1.2 | 4 |
| Robustness | Object-centric (test) | Hamming Distance (x100) | 0.3 | 4 |