
PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

About

Recently, 3D Gaussian Splatting (3DGS) has shown encouraging performance on open vocabulary scene understanding tasks. However, previous methods usually predict only a heatmap between the scene features and a text query, so they cannot distinguish 3D instance-level information. In this paper, we propose PanoGS, a novel and effective approach to 3D panoptic open vocabulary scene understanding. Technically, to learn accurate 3D language features that scale to large indoor scenes, we adopt a pyramid tri-plane to model the latent continuous parametric feature space and use a 3D feature decoder to regress the multi-view fused 2D feature cloud. In addition, we propose language-guided graph cuts that jointly leverage reconstructed geometry and learned language cues to group 3D Gaussian primitives into a set of super-primitives. To obtain 3D-consistent instances, we perform graph-clustering-based segmentation with SAM-guided edge affinity computation between super-primitives. Extensive experiments on widely used datasets show better or more competitive performance on 3D panoptic open vocabulary scene understanding. Project page: https://zju3dv.github.io/panogs
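The instance-grouping step described in the abstract can be pictured with a small sketch. This is not the PanoGS implementation: it assumes the edge affinities between super-primitives are already given as scores in [0, 1], and it substitutes a plain threshold-and-merge (union-find) rule for the paper's graph clustering. The function name `cluster_super_primitives`, the `threshold` parameter, and the example affinities are all hypothetical.

```python
def cluster_super_primitives(num_nodes, affinities, threshold=0.5):
    """Group nodes into instances by merging edges with affinity >= threshold.

    num_nodes:  number of super-primitives (graph nodes).
    affinities: dict mapping (i, j) node-index pairs to a score in [0, 1]
                (assumed to be precomputed; hypothetical input format).
    Returns a list with one instance id per node.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two endpoints of every sufficiently strong edge.
    for (i, j), score in affinities.items():
        if score >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Relabel roots as consecutive instance ids in order of first appearance.
    root_to_id = {}
    labels = []
    for node in range(num_nodes):
        root = find(node)
        if root not in root_to_id:
            root_to_id[root] = len(root_to_id)
        labels.append(root_to_id[root])
    return labels


# Hypothetical affinities between five super-primitives.
example_affinities = {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8, (3, 4): 0.1}
labels = cluster_super_primitives(5, example_affinities, threshold=0.5)
print(labels)  # [0, 0, 1, 1, 2]
```

In this toy input, nodes 0 and 1 and nodes 2 and 3 are joined by strong edges, so they form two instances, while node 4 stays on its own. A fixed threshold is only the simplest possible merge rule; the paper's graph clustering may behave differently.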

Hongjia Zhai, Hai Li, Zhenzhe Li, Xiaokun Pan, Yijia He, Guofeng Zhang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| 3D Semantic Segmentation | ScanNet V2 (val) | mIoU 50.72 | 209 |
| 3D Semantic Segmentation | Replica | 3D mIoU 54.98 | 41 |
| 3D Semantic Segmentation | ScanNet V2 | mIoU 50.72 | 16 |
| 3D Semantic Segmentation | Replica (test) | mIoU (All) 54.98 | 10 |
| 3D Panoptic Segmentation | ScanNet V2 | PRQ (Things) 33.84 | 7 |
| 3D Panoptic Segmentation | ScanNet V2 (val) | PRQ (Thing) 49.26 | 6 |
| 3D Panoptic Segmentation | Replica | PRQ (Thing) 43.04 | 5 |
| 3D Panoptic Segmentation | Replica (test) | PRQ (Thing) 43.04 | 5 |
