OneFormer3D: One Transformer for Unified Point Cloud Segmentation

About

Semantic, instance, and panoptic segmentation of 3D point clouds have been addressed using task-specific models of distinct design. As a result, the similarity of all segmentation tasks and the implicit relationship between them have not been utilized effectively. This paper presents a unified, simple, and effective model addressing all these tasks jointly. The model, named OneFormer3D, performs instance and semantic segmentation consistently, using a group of learnable kernels, where each kernel is responsible for generating a mask for either an instance or a semantic category. These kernels are trained with a transformer-based decoder that takes unified instance and semantic queries as input. Such a design enables training a model end-to-end in a single run, so that it achieves top performance on all three segmentation tasks simultaneously. Specifically, our OneFormer3D ranks 1st and sets a new state-of-the-art (+2.1 mAP50) in the ScanNet test leaderboard. We also demonstrate state-of-the-art results in semantic, instance, and panoptic segmentation on the ScanNet (+21 PQ), ScanNet200 (+3.8 mAP50), and S3DIS (+0.8 mIoU) datasets.
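
The kernel idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each kernel is a vector matched against per-point features by a dot product followed by a sigmoid, and that the first kernels correspond to instance queries and the rest to semantic categories. The function name predict_masks, the sizes, and the random features and kernels are hypothetical stand-ins for what a backbone and a transformer decoder would produce.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict_masks(point_features, kernels, threshold=0.5):
    """Return one binary mask over all points for each kernel.

    point_features: list of N feature vectors, each of length D
    kernels: list of K kernel vectors, each of length D
    Assumption: mask probability = sigmoid(kernel . point_feature).
    """
    masks = []
    for kernel in kernels:
        probs = [1.0 / (1.0 + math.exp(-dot(kernel, f))) for f in point_features]
        masks.append([p > threshold for p in probs])
    return masks


# Toy sizes (hypothetical): 6 points, 4-dim features,
# 3 instance queries and 2 semantic categories.
random.seed(0)
NUM_POINTS, DIM = 6, 4
NUM_INSTANCE_QUERIES, NUM_SEMANTIC_CLASSES = 3, 2

# Random stand-ins for backbone features and decoder outputs.
point_features = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_POINTS)]
kernels = [
    [random.gauss(0, 1) for _ in range(DIM)]
    for _ in range(NUM_INSTANCE_QUERIES + NUM_SEMANTIC_CLASSES)
]

# One pass produces both kinds of masks from the same set of kernels.
masks = predict_masks(point_features, kernels)
instance_masks = masks[:NUM_INSTANCE_QUERIES]
semantic_masks = masks[NUM_INSTANCE_QUERIES:]
```

In this sketch the instance and semantic masks come from a single call over one shared list of kernels, which is the sense in which the two tasks are handled by one unified set of queries.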

Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | S3DIS (Area 5) | mIoU 72.4 | 1006 |
| Semantic segmentation | ScanNet V2 (val) | mIoU 76.6 | 380 |
| 3D Object Detection | ScanNet V2 (val) | mAP@0.25 76.9 | 361 |
| Semantic segmentation | S3DIS (6-fold) | mIoU (Mean IoU) 75 | 344 |
| 3D Instance Segmentation | ScanNet V2 (val) | Average AP50 76.3 | 198 |
| 3D Semantic Segmentation | ScanNet (val) | mIoU 76.6 | 144 |
| Semantic segmentation | ScanNet200 (val) | mIoU 30.1 | 136 |
| 3D Instance Segmentation | ScanNet v2 (test) | mAP 56.6 | 135 |
| 3D Object Detection | ScanNet | mAP@0.25 76.9 | 127 |
| 3D Instance Segmentation | S3DIS (Area 5) | mAP@50% IoU 68.5 | 120 |
Showing 10 of 42 rows
