
GeoGuide: Hierarchical Geometric Guidance for Open-Vocabulary 3D Semantic Segmentation

About

Open-vocabulary 3D semantic segmentation aims to segment arbitrary categories beyond the training set. Existing methods predominantly rely on distilling knowledge from 2D open-vocabulary models. However, aligning 3D features to the 2D representation space restricts intrinsic 3D geometric learning and inherits errors from 2D predictions. To address these limitations, we propose GeoGuide, a novel framework that leverages pretrained 3D models to integrate hierarchical geometry-semantic consistency for open-vocabulary 3D segmentation. Specifically, we introduce an Uncertainty-based Superpoint Distillation module to fuse geometric and semantic features for estimating per-point uncertainty, adaptively weighting 2D features within superpoints to suppress noise while preserving discriminative information to enhance local semantic consistency. Furthermore, our Instance-level Mask Reconstruction module leverages geometric priors to enforce semantic consistency within instances by reconstructing complete instance masks. Additionally, our Inter-Instance Relation Consistency module aligns geometric and semantic similarity matrices to calibrate cross-instance consistency for same-category objects, mitigating viewpoint-induced semantic drift. Extensive experiments on ScanNet v2, Matterport3D, and nuScenes demonstrate the superior performance of GeoGuide.
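Two of the ideas in the abstract can be illustrated with a small NumPy sketch: pooling 2D features within a superpoint while down-weighting uncertain points, and penalizing disagreement between a geometric and a semantic similarity matrix over instances. This is not the authors' implementation. The function names, the exp(-uncertainty) weighting, and the mean-squared-error penalty on cosine similarities are all assumptions chosen for illustration, and the per-point uncertainty is taken as given rather than predicted from fused features.

```python
import numpy as np


def superpoint_weighted_pool(feats_2d, uncertainty, superpoint_ids):
    """Average per-point 2D features inside each superpoint.

    feats_2d:       (N, C) features lifted from a 2D model (assumed given).
    uncertainty:    (N,) non-negative per-point uncertainty (assumed given).
    superpoint_ids: (N,) integer superpoint index per point, in [0, S).
    Returns an (S, C) array of pooled superpoint features.
    """
    feats_2d = np.asarray(feats_2d, dtype=np.float64)
    ids = np.asarray(superpoint_ids)
    # Assumed weighting: confident points (low uncertainty) count more.
    weights = np.exp(-np.asarray(uncertainty, dtype=np.float64))

    num_sp = int(ids.max()) + 1
    weighted_sum = np.zeros((num_sp, feats_2d.shape[1]))
    weight_sum = np.zeros(num_sp)
    # Unbuffered scatter-add so repeated superpoint ids accumulate correctly.
    np.add.at(weighted_sum, ids, feats_2d * weights[:, None])
    np.add.at(weight_sum, ids, weights)
    return weighted_sum / np.maximum(weight_sum, 1e-12)[:, None]


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x, shape (K, K)."""
    x = np.asarray(x, dtype=np.float64)
    x = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    return x @ x.T


def relation_consistency_loss(geo_feats, sem_feats):
    """Mean squared gap between geometric and semantic instance similarities.

    geo_feats: (K, Dg) one geometric descriptor per instance (assumed given).
    sem_feats: (K, Ds) one semantic descriptor per instance (assumed given).
    """
    diff = cosine_similarity_matrix(geo_feats) - cosine_similarity_matrix(sem_feats)
    return float(np.mean(diff ** 2))
```

With uniform uncertainty the pooling reduces to a plain per-superpoint mean, and the relation loss is zero whenever the two similarity matrices agree, which is the behaviour the abstract's "adaptive weighting" and "aligning similarity matrices" suggest at a high level.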

Xujing Tao, Chuxin Wang, Yubo Ai, Zhixin Cheng, Zhuoyuan Li, Liangsheng Liu, Yujia Chen, Xinjun Li, Qiao Li, Wenfei Yang, Tianzhu Zhang • 2026

Related benchmarks

Task                      Dataset                     Result      Rank
3D Semantic Segmentation  ScanNet V2 (val)            mIoU 64.8   209
3D Semantic Segmentation  Matterport3D (test)         mIoU 51.9   32
3D Semantic Segmentation  Matterport3D K=40 (test)    mIoU 38.5   17
3D Semantic Segmentation  Matterport3D K=80 (test)    mIoU 22     17
3D Semantic Segmentation  Matterport3D K=160 (test)   mIoU 11.6   17
3D Semantic Segmentation  Matterport3D 1.0 (test)     mAcc 66.3   14
3D Semantic Segmentation  nuScenes 1.0 (val)          mIoU 50.3   13