OccuSeg: Occupancy-aware 3D Instance Segmentation
About
3D instance segmentation, with a variety of applications in robotics and augmented reality, is in great demand. Unlike 2D images, which are projective observations of the environment, 3D models provide a metric reconstruction of the scene without occlusion or scale ambiguity. In this paper, we define the "3D occupancy size" as the number of voxels occupied by each instance. This quantity can be predicted robustly, and on that basis we propose OccuSeg, an occupancy-aware 3D instance segmentation scheme. Our multi-task learning produces both an occupancy signal and embedding representations, where the spatial and feature embeddings are trained differently because they differ in scale-awareness. Our clustering scheme benefits from a reliable comparison between the predicted occupancy size and the clustered occupancy size, which encourages hard samples to be clustered correctly and avoids over-segmentation. The proposed approach achieves state-of-the-art performance on three real-world datasets, i.e., ScanNetV2, S3DIS and SceneNN, while maintaining high efficiency.
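To make the definition of occupancy size concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-voxel instance labels stored in a NumPy array. The helper names `occupancy_sizes` and `occupancy_ratio` are hypothetical, and comparing a cluster's voxel count against the mean of its members' predicted occupancy is an assumed way to realize the comparison the abstract describes.

```python
import numpy as np

def occupancy_sizes(instance_ids):
    """Count the voxels occupied by each instance (its occupancy size)."""
    ids, counts = np.unique(instance_ids, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts)}

def occupancy_ratio(cluster_voxel_count, predicted_occupancy):
    """Hypothetical check: clustered size divided by the mean predicted size.

    A ratio near 1 suggests the cluster covers a whole instance; a ratio well
    below 1 suggests a fragment (possible over-segmentation).
    """
    return cluster_voxel_count / float(np.mean(predicted_occupancy))

# Toy example: six voxels belonging to three instances.
labels = np.array([0, 0, 1, 1, 1, 2])
sizes = occupancy_sizes(labels)

# A cluster of 3 voxels whose members predict sizes close to 3.
ratio = occupancy_ratio(3, np.array([3.2, 2.8, 3.0]))
```

In this toy setup a cluster would be accepted when `ratio` is close to 1; the acceptance threshold is a design choice that is not specified here.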
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Instance Segmentation | ScanNet V2 (val) | Average AP50: 60.7 | 195 |
| 3D Instance Segmentation | ScanNet v2 (test) | mAP: 48.6 | 135 |
| 3D Semantic Segmentation | ScanNet v2 (test) | mIoU: 76.4 | 110 |
| 3D Instance Segmentation | S3DIS (Area 5) | -- | 106 |
| 3D Semantic Segmentation | ScanNet (test) | mIoU: 76.4 | 105 |
| 3D Instance Segmentation | S3DIS (6-fold CV) | Mean Precision @50% IoU: 72.8 | 92 |
| 3D Semantic Segmentation | ScanNet v1 (test) | -- | 72 |
| 3D Instance Segmentation | ScanNet hidden v2 (test) | Cabinet AP@0.5: 57.6 | 69 |
| Instance Segmentation | S3DIS (6-fold CV) | mPrec: 72.8 | 40 |
| Instance Segmentation | ScanNet (val) | mAP: 44.2 | 39 |