Segment Any 3D Gaussians

About

This paper presents SAGA (Segment Any 3D GAussians), a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS). Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms. This is achieved by attaching a scale-gated affinity feature to each 3D Gaussian, endowing it with a new property for multi-granularity segmentation. Specifically, a scale-aware contrastive training strategy is proposed for learning the scale-gated affinity features. It 1) distills the segmentation capability of the Segment Anything Model (SAM) from 2D masks into the affinity features and 2) employs a soft scale gate mechanism to handle multi-granularity ambiguity in 3D segmentation by adjusting the magnitude of each feature channel according to a specified 3D physical scale. Evaluations demonstrate that SAGA achieves real-time multi-granularity segmentation with quality comparable to state-of-the-art methods. As one of the first methods addressing promptable segmentation in 3D-GS, the simplicity and effectiveness of SAGA pave the way for future advancements in this field. Our code will be released.
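The soft scale gate described above can be illustrated with a small sketch. The abstract does not give the gate's functional form, so everything here is an assumption for illustration: the gate is modeled as a per-channel sigmoid of an affine function of the scale, the prompt is simplified to the index of one Gaussian, and segmentation is done by thresholding cosine similarity. The names `scale_gate`, `gated_feature`, and `segment`, as well as the toy features and gate parameters, are hypothetical and not taken from the SAGA code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def scale_gate(scale, weights, biases):
    """Hypothetical soft gate: one value in (0, 1) per feature channel."""
    return [sigmoid(w * scale + b) for w, b in zip(weights, biases)]


def gated_feature(feature, gate):
    """Scale each channel by its gate value, then L2-normalize."""
    scaled = [f * g for f, g in zip(feature, gate)]
    norm = math.sqrt(sum(v * v for v in scaled)) or 1.0
    return [v / norm for v in scaled]


def segment(features, query_index, scale, weights, biases, threshold=0.8):
    """Indices of Gaussians whose gated feature is close to the query's."""
    gate = scale_gate(scale, weights, biases)
    gated = [gated_feature(f, gate) for f in features]
    q = gated[query_index]
    return [
        i for i, g in enumerate(gated)
        if sum(a * b for a, b in zip(q, g)) >= threshold
    ]


# Toy affinity features for three Gaussians (assumed values).
# Channel 0 separates coarse groups, channel 1 separates fine groups.
features = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]
# Assumed gate parameters: a large scale opens channel 0, a small scale opens channel 1.
weights = [10.0, -10.0]
biases = [-5.0, 5.0]

coarse = segment(features, 0, 1.0, weights, biases)
fine = segment(features, 0, 0.0, weights, biases)
print(coarse, fine)
```

In this toy setup the same query Gaussian selects different neighbors depending on the scale, which is the multi-granularity behavior the abstract attributes to the gate. The learned features and the actual gate in SAGA may differ.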

Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian • 2023

Related benchmarks

Task	Dataset	Result
3D Semantic Segmentation	3D-OVS	Bed 97.4	42
Open-Vocabulary 3D Segmentation	LERF-Mask (test)	Figurines mIoU 90.7	19
3DGS Segmentation	NVOS 1.0 (test)	mIoU 90.9	12
Segmentation	NVOS (test)	mIoU 90.9	9
Open-Vocabulary Segmentation	SPIn-NeRF	mIoU 93.7	8
4D Scene Segmentation	Neural3DV	mIoU (coffee_martini) 22.01	8
Multi-view Promptable Segmentation	SPIn-NeRF	mIoU 93.7	7
Open-Vocabulary Segmentation	NVOS	mIoU 92.6	7
Single-object Reconstruction	NVOS	mIoU 90.9	6
Multi-view Promptable Segmentation	NVOS	mIoU 92.6	6

Showing 10 of 14 rows
