
Mask3D: Mask Transformer for 3D Semantic Instance Segmentation

About

Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques. Building on the successes of recent Transformer-based methods for object detection and image segmentation, we propose the first Transformer-based approach for 3D semantic instance segmentation. We show that we can leverage generic Transformer building blocks to directly predict instance masks from 3D point clouds. In our model, called Mask3D, each object instance is represented as an instance query. Using Transformer decoders, the instance queries are learned by iteratively attending to point cloud features at multiple scales. Combined with point features, the instance queries directly yield all instance masks in parallel. Mask3D has several advantages over current state-of-the-art approaches: it relies neither on (1) voting schemes that require hand-selected geometric properties (such as centers) nor on (2) geometric grouping mechanisms that require manually tuned hyperparameters (e.g. radii), and (3) it enables a loss that directly optimizes instance masks. Mask3D sets a new state of the art on ScanNet test (+6.2 mAP), S3DIS 6-fold (+10.1 mAP), STPLS3D (+11.2 mAP) and ScanNet200 test (+12.4 mAP).
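The step in which instance queries are combined with point features to yield masks can be illustrated with a small sketch. It assumes, purely for illustration, that each mask comes from a dot product between a query vector and every per-point feature vector, followed by a sigmoid and a 0.5 threshold; the abstract does not spell out these details. The function name `predict_instance_masks`, the toy feature values, and the fixed (not learned) queries are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_instance_masks(point_features, instance_queries, threshold=0.5):
    """Return one binary mask per instance query.

    Assumption (not stated in the abstract): a point belongs to an
    instance when sigmoid(query . feature) exceeds `threshold`.
    """
    masks = []
    for query in instance_queries:
        mask = []
        for feature in point_features:
            # Similarity between this query and this point's feature.
            logit = sum(q * f for q, f in zip(query, feature))
            mask.append(1 if sigmoid(logit) > threshold else 0)
        masks.append(mask)
    return masks


# Toy data: 4 points with 2-dimensional features and 2 instance queries.
point_features = [[2.0, -1.0], [1.5, -0.5], [-1.0, 2.0], [-0.5, 1.0]]
instance_queries = [[1.0, -1.0], [-1.0, 1.0]]

masks = predict_instance_masks(point_features, instance_queries)
print(masks)  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

Every query is scored against every point independently, so all masks are produced in a single pass with no voting or clustering step. In a real model the queries would be refined by the Transformer decoder rather than fixed by hand as they are here.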

Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Object Detection | ScanNet V2 (val) | -- | 352 |
| 3D Instance Segmentation | ScanNet V2 (val) | Average AP50 73.7 | 195 |
| 3D Instance Segmentation | ScanNet v2 (test) | mAP 56.6 | 135 |
| 3D Object Detection | ScanNet | mAP@0.25 71 | 123 |
| 3D Instance Segmentation | S3DIS (Area 5) | mAP@50% IoU 71.9 | 106 |
| 3D Instance Segmentation | S3DIS (6-fold CV) | Mean Precision @50% IoU 76.5 | 92 |
| Instance Segmentation | ScanNetV2 (val) | mAP@0.5 73.7 | 58 |
| Instance Segmentation | ScanNet200 (val) | mAP@50 37 | 53 |
| 3D Instance Segmentation | ScanNet200 (val) | mAP 27.4 | 52 |
| 3D Instance Segmentation | ScanNet (val) | mAP@0.25 83.5 | 19 |
Showing 10 of 33 rows
