
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding

About

Neural network performance scales with both model size and data volume, as shown in language and image processing. This requires scaling-friendly architectures and large datasets. While transformers have been adapted for 3D vision, a "GPT-moment" remains elusive due to limited training data. We introduce ARKit LabelMaker, a large-scale real-world 3D dataset with dense semantic annotation that is more than three times larger than the largest prior dataset. Specifically, we extend ARKitScenes with automatically generated dense 3D labels using an extended LabelMaker pipeline, tailored for large-scale pre-training. Training on our dataset improves accuracy across architectures, achieving state-of-the-art 3D semantic segmentation scores on ScanNet and ScanNet200, with notable gains on tail classes. Our code is available at https://labelmaker.org and our dataset at https://huggingface.co/datasets/labelmaker/arkit_labelmaker.

Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Semantic Segmentation | ScanNet20 v2 (test) | mIoU 79.8 | 24 |
| Semantic segmentation | ScanNet++ (val) | mIoU 45.3 | 16 |
| 3D Semantic Segmentation | ScanNet200 (test) | -- | 15 |
| 3D Semantic Segmentation | ScanNet200 (val) | mIoU (All Classes) 37.5 | 14 |
| 3D Semantic Segmentation | ScanNet20 (val) | mIoU 79.1 | 9 |
| Region Segmentation | Matterport3D Top-40 NYU Classes (test) | mAcc 53.6 | 7 |
| Region Segmentation | Matterport3D Top-80 NYU Classes (test) | mAcc 38.2 | 7 |
| Region Segmentation | Matterport3D NYU Classes Top-160 (test) | mAcc 26 | 7 |
| Instance Segmentation | ScanNet200 30 (test) | mAP@0.25 42.9 | 4 |
| Instance Segmentation | ScanNet 9 (test) | AP@0.25 78.5 | 4 |
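
The semantic and region segmentation results above are reported as mIoU (mean intersection over union) and mAcc (mean per-class accuracy). As a rough illustration of how these two metrics are commonly computed from per-point labels, here is a minimal sketch. The function names, the `ignore_index` convention, and the choice to average only over classes that actually occur are assumptions made for this example; the official evaluation scripts of each benchmark may differ in such details.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, ignore_index=-1):
    """Build a (num_classes x num_classes) matrix; rows = ground truth, cols = prediction.

    Points whose ground-truth label equals `ignore_index` are skipped
    (an assumed convention for unlabeled points).
    """
    gt = np.asarray(gt, dtype=np.int64).ravel()
    pred = np.asarray(pred, dtype=np.int64).ravel()
    valid = gt != ignore_index
    # Encode each (gt, pred) pair as a single index and count occurrences.
    flat = gt[valid] * num_classes + pred[valid]
    counts = np.bincount(flat, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Mean of per-class IoU = TP / (TP + FP + FN), over classes with non-empty union."""
    tp = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return float((tp[present] / union[present]).mean())


def mean_accuracy(conf):
    """Mean of per-class accuracy = TP / (ground-truth points), over classes present in gt."""
    tp = np.diag(conf).astype(float)
    gt_count = conf.sum(axis=1)
    present = gt_count > 0
    return float((tp[present] / gt_count[present]).mean())


if __name__ == "__main__":
    # Toy labels for six points and three classes (illustrative values only).
    gt = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    conf = confusion_matrix(gt, pred, num_classes=3)
    print("mIoU:", mean_iou(conf))
    print("mAcc:", mean_accuracy(conf))
```

Because both metrics average over classes rather than over points, rare (tail) classes weigh as much as frequent ones, which is why gains on tail classes show up clearly in mIoU.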
