Top-Down Beats Bottom-Up in 3D Instance Segmentation

About

Most 3D instance segmentation methods exploit a bottom-up strategy, typically including resource-intensive post-processing. For point grouping, bottom-up methods rely on prior assumptions about the objects in the form of hyperparameters, which are domain-specific and need to be carefully tuned. In contrast, we address 3D instance segmentation with TD3D: the pioneering cluster-free, fully-convolutional and entirely data-driven approach trained in an end-to-end manner. This is the first top-down method to outperform bottom-up approaches in the 3D domain. With its straightforward pipeline, it demonstrates outstanding accuracy and generalization ability on the standard indoor benchmarks: ScanNet v2, its extension ScanNet200, and S3DIS, as well as on the aerial STPLS3D dataset. Moreover, our method is much faster at inference than the current state-of-the-art grouping-based approaches: our flagship modification is 1.9x faster than the most accurate bottom-up method while being more accurate, and our faster modification shows state-of-the-art accuracy while running at 2.6x speed. Code is available at https://github.com/SamsungLabs/td3d.

Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich • 2023

Related benchmarks

Task	Dataset	Metric	Result
3D Instance Segmentation	ScanNet V2 (val)	Average AP50	71.2	198
3D Instance Segmentation	ScanNet v2 (test)	mAP	48.9	135
3D Instance Segmentation	S3DIS (Area 5)	mAP@50% IoU	65.1	120
3D Instance Segmentation	S3DIS (6-fold CV)	Mean Precision @50% IoU	76.3	92
3D Instance Segmentation	ScanNet200 (val)	mAP	23.1	78
Instance Segmentation	ScanNet200 (val)	mAP@50	34.8	72
3D Instance Segmentation	ScanNet hidden v2 (test)	Cabinet AP@0.5	32.2	69
3D Instance Segmentation	ScanNet (val)	mAP@0.25	81.9	19
3D Instance Segmentation	ScanNet online benchmark	mAP@25	64	18
Instance Segmentation	S3DIS Area-5 (val)	mAP50	65.1	16

Showing 10 of 13 rows
