Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

OpenMask3D: Open-Vocabulary 3D Instance Segmentation

About

We introduce the task of open-vocabulary 3D instance segmentation. Current approaches for 3D instance segmentation can typically only recognize object categories from a pre-defined closed set of classes that are annotated in the training datasets. This results in important limitations for real-world applications where one might need to perform tasks guided by novel, open-vocabulary queries related to a wide variety of objects. Recently, open-vocabulary 3D scene understanding methods have emerged to address this problem by learning queryable features for each point in the scene. While such a representation can be directly employed to perform semantic segmentation, existing methods cannot separate multiple object instances. In this work, we address this limitation, and propose OpenMask3D, which is a zero-shot approach for open-vocabulary 3D instance segmentation. Guided by predicted class-agnostic 3D instance masks, our model aggregates per-mask features via multi-view fusion of CLIP-based image embeddings. Experiments and ablation studies on ScanNet200 and Replica show that OpenMask3D outperforms other open-vocabulary methods, especially on the long-tail distribution. Qualitative experiments further showcase OpenMask3D's ability to segment object properties based on free-form queries describing geometry, affordances, and materials.

Ay\c{c}a Takmaz, Elisabetta Fedele, Robert W. Sumner, Marc Pollefeys, Federico Tombari, Francis Engelmann• 2023

Related benchmarks

TaskDatasetResultRank
3D Semantic SegmentationScanNet (val)
mIoU34
100
Semantic segmentationScanNet V2
mIoU34
54
Instance SegmentationScanNet200 (val)
mAP@5019.9
53
3D Instance SegmentationScanNet200 (val)
mAP15.4
52
3D Instance SegmentationScanNet200
mAP@0.519.9
29
3D Semantic SegmentationScanNet200 (val)
mIoU (All Classes)10.3
14
Class-agnostic 3D instance segmentationScanNet200 (val)
AP39.7
12
Semantic segmentationScanNet200
mIoU10.5
12
Open-vocabulary 3D instance segmentationScanNet200 v1 (val)
AP Novel15
7
Open-vocabulary 3D instance segmentationScanNet200 (val)
mAP15.4
7
Showing 10 of 23 rows

Other info

Code

Follow for update