Top-Down Beats Bottom-Up in 3D Instance Segmentation
About
Most 3D instance segmentation methods follow a bottom-up strategy that typically includes resource-intensive post-processing. For point grouping, bottom-up methods rely on prior assumptions about the objects in the form of hyperparameters, which are domain-specific and need to be carefully tuned. In contrast, we address 3D instance segmentation with TD3D: the pioneering cluster-free, fully-convolutional and entirely data-driven approach trained in an end-to-end manner. This is the first top-down method to outperform bottom-up approaches in the 3D domain. With its straightforward pipeline, it demonstrates outstanding accuracy and generalization ability on the standard indoor benchmarks ScanNet v2, its extension ScanNet200, and S3DIS, as well as on the aerial STPLS3D dataset. Moreover, our method is much faster at inference than the current state-of-the-art grouping-based approaches: our flagship modification is 1.9x faster than the most accurate bottom-up method while also being more accurate, and our faster modification shows state-of-the-art accuracy while running at 2.6x the speed. Code is available at https://github.com/SamsungLabs/td3d.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Instance Segmentation | ScanNet V2 (val) | Average AP50 | 71.2 | 195 |
| 3D Instance Segmentation | ScanNet v2 (test) | mAP | 48.9 | 135 |
| 3D Instance Segmentation | S3DIS (Area 5) | mAP@50% IoU | 65.1 | 106 |
| 3D Instance Segmentation | S3DIS (6-fold CV) | Mean Precision @50% IoU | 76.3 | 92 |
| 3D Instance Segmentation | ScanNet hidden v2 (test) | Cabinet AP@0.5 | 32.2 | 69 |
| Instance Segmentation | ScanNet200 (val) | mAP@50 | 34.8 | 53 |
| 3D Instance Segmentation | ScanNet200 (val) | mAP | 23.1 | 52 |
| 3D Instance Segmentation | ScanNet (val) | mAP@0.25 | 81.9 | 19 |
| 3D Instance Segmentation | ScanNet online benchmark | mAP@25 | 64 | 18 |
| Instance Segmentation | S3DIS Area-5 (val) | mAP50 | 65.1 | 16 |
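
Several of the metrics above are reported at an IoU threshold (for example AP50 at 50% IoU, or mAP@0.25). As a rough illustration of what such a threshold means, the sketch below computes the IoU between a predicted and a ground-truth instance mask and checks it against a threshold. It is not taken from the TD3D code or from any benchmark's evaluation script: the function names `mask_iou` and `is_match`, the representation of masks as sets of point indices, and the strict `>` comparison are all assumptions made for this example.

```python
from typing import Set


def mask_iou(pred: Set[int], gt: Set[int]) -> float:
    """IoU of two instance masks, each given as a set of point indices."""
    union = len(pred | gt)
    if union == 0:
        # Two empty masks: define IoU as 0 to avoid division by zero.
        return 0.0
    return len(pred & gt) / union


def is_match(pred: Set[int], gt: Set[int], iou_threshold: float) -> bool:
    """True if the prediction overlaps the ground truth above the threshold.

    Assumption: a strict '>' comparison; evaluation scripts may differ
    in how they treat a prediction that lands exactly on the threshold.
    """
    return mask_iou(pred, gt) > iou_threshold


# Hypothetical toy masks over point indices 0..6.
pred_mask = {0, 1, 2, 3, 4, 5}
gt_mask = {1, 2, 3, 4, 5, 6}

iou = mask_iou(pred_mask, gt_mask)  # 5 shared points, 7 in the union
print(f"IoU = {iou:.3f}")
print("match at 0.25:", is_match(pred_mask, gt_mask, 0.25))
print("match at 0.50:", is_match(pred_mask, gt_mask, 0.50))
print("match at 0.75:", is_match(pred_mask, gt_mask, 0.75))
```

In this toy case the same prediction counts as a match at the 0.25 and 0.50 thresholds but not at 0.75, which is why a method's score at a looser threshold is typically higher than at a stricter one. Average precision additionally ranks predictions by confidence and integrates over the precision-recall curve, which this sketch leaves out.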