
Mask-Attention-Free Transformer for 3D Instance Segmentation

About

Recently, transformer-based methods have dominated 3D instance segmentation, where mask attention is commonly involved. Specifically, object queries are guided by the initial instance masks in the first cross-attention, and then iteratively refine themselves in a similar manner. However, we observe that the mask-attention pipeline usually leads to slow convergence due to low-recall initial instance masks. Therefore, we abandon the mask attention design and resort to an auxiliary center regression task instead. Through center regression, we effectively overcome the low-recall issue and perform cross-attention by imposing a positional prior. To reach this goal, we develop a series of position-aware designs. First, we learn a spatial distribution of 3D locations as the initial position queries. They spread densely over the 3D space, and thus can easily capture the objects in a scene with high recall. Moreover, we present relative position encoding for the cross-attention and iterative refinement for more accurate position queries. Experiments show that our approach converges 4x faster than existing work, sets a new state of the art on the ScanNetv2 3D instance segmentation benchmark, and also demonstrates superior performance across various datasets. Code and models are available at https://github.com/dvlab-research/Mask-Attention-Free-Transformer.

Xin Lai, Yuhui Yuan, Ruihang Chu, Yukang Chen, Han Hu, Jiaya Jia • 2023
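
The abstract's core idea, cross-attention steered by a positional prior instead of an instance mask, followed by iterative refinement of the position queries, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). Everything here is an assumption made for illustration: the function names are hypothetical, a fixed Gaussian distance bias stands in for the paper's relative position encoding, and the attention-weighted mean of point coordinates stands in for the refinement of position queries.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_with_position_bias(content_q, pos_q, feats, coords, sigma=0.1):
    """One cross-attention step with a distance-based bias (hypothetical form).

    content_q: (Q, D) content queries
    pos_q:     (Q, 3) position queries
    feats:     (N, D) point features
    coords:    (N, 3) point coordinates
    """
    d = content_q.shape[-1]
    logits = content_q @ feats.T / np.sqrt(d)              # (Q, N) content term
    rel = pos_q[:, None, :] - coords[None, :, :]           # (Q, N, 3) relative positions
    # Assumed stand-in for relative position encoding: nearby points get a higher bias.
    bias = -np.sum(rel ** 2, axis=-1) / (2.0 * sigma ** 2)  # (Q, N)
    attn = softmax(logits + bias, axis=-1)                 # each row sums to 1
    update = attn @ feats                                  # (Q, D) aggregated features
    center = attn @ coords                                 # (Q, 3) attended location
    return update, center, attn


def refine_queries(content_q, pos_q, feats, coords, num_layers=3, sigma=0.1):
    """Alternate attention and position updates for a few layers.

    Assumption: the new position query is simply the attended location.
    """
    for _ in range(num_layers):
        update, center, _ = cross_attention_with_position_bias(
            content_q, pos_q, feats, coords, sigma
        )
        content_q = content_q + update  # residual update of the content query
        pos_q = center                  # move the position query
    return content_q, pos_q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy scene: two well-separated clusters of points in the unit cube.
    coords = np.concatenate([
        0.2 + 0.02 * rng.standard_normal((50, 3)),
        0.8 + 0.02 * rng.standard_normal((50, 3)),
    ])
    feats = rng.standard_normal((100, 16))
    pos_q = np.array([[0.25, 0.25, 0.25], [0.75, 0.75, 0.75]])
    content_q = np.zeros((2, 16))
    _, refined_pos = refine_queries(content_q, pos_q, feats, coords)
    print(refined_pos.round(2))
```

In this toy setup each query attends almost exclusively to the cluster nearest its position query, without needing an initial mask to tell it where to look. A learned encoding would replace the fixed Gaussian bias in practice.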

Related benchmarks

Task | Dataset | Result | Rank
3D Object Detection | ScanNet V2 (val) | mAP@0.25: 73.5 | 352
3D Instance Segmentation | ScanNet V2 (val) | Average AP50: 76.5 | 195
3D Instance Segmentation | ScanNet v2 (test) | mAP: 59.6 | 135
3D Instance Segmentation | S3DIS (Area 5) | mAP@50% IoU: 69.1 | 106
3D Instance Segmentation | ScanNet hidden v2 (test) | Cabinet AP@0.5: 46 | 69
Instance Segmentation | ScanNetV2 (val) | mAP@0.5: 75.9 | 58
Instance Segmentation | ScanNet200 (val) | mAP@50: 38.2 | 53
3D Instance Segmentation | ScanNet200 (val) | mAP: 29.2 | 52
3D Instance Segmentation | ScanNet (test) | mAP: 59.6 | 15
Instance Segmentation | ScanNet (test) | mAP: 59.6 | 13
Showing 10 of 16 rows
