
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

About

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) is notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary: remarkably, on ScanNet, even using only 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.

Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie • 2020
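
The abstract describes a pre-training objective that combines point-level correspondences with spatial contexts. The sketch below is one plausible reading of that idea, not the authors' implementation: an InfoNCE-style contrastive loss over matched points from two views, where the negatives for each anchor are grouped by their spatial partition around the anchor. The function names, the azimuth-only partitioning, the bin count, and the temperature value are all illustrative assumptions.

```python
import math


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(u, v))


def partition_id(anchor, other, num_bins=4):
    """Azimuth bin of `other` relative to `anchor` in the x-y plane.

    Assumption: partitions are plain angular sectors; distance bins could
    be added in the same way.
    """
    angle = math.atan2(other[1] - anchor[1], other[0] - anchor[0])
    return int((angle + math.pi) / (2 * math.pi) * num_bins) % num_bins


def partitioned_info_nce(xyz, feats_a, feats_b, num_bins=4, temperature=0.4):
    """Average InfoNCE loss computed separately per spatial partition.

    xyz[i]     : 3D position of matched point i
    feats_a[i] : feature of point i in view A
    feats_b[i] : feature of the same point in view B (the positive)
    """
    n = len(xyz)
    losses = []
    for i in range(n):
        pos = dot(feats_a[i], feats_b[i]) / temperature
        # Group negative logits by the partition the negative falls into.
        groups = {}
        for k in range(n):
            if k == i:
                continue
            pid = partition_id(xyz[i], xyz[k], num_bins)
            neg = dot(feats_a[i], feats_b[k]) / temperature
            groups.setdefault(pid, []).append(neg)
        # One InfoNCE term per non-empty partition (stable log-sum-exp).
        for negs in groups.values():
            logits = [pos] + negs
            m = max(logits)
            lse = m + math.log(sum(math.exp(l - m) for l in logits))
            losses.append(lse - pos)
    return sum(losses) / len(losses) if losses else 0.0


# Toy usage: four points with one-hot features.
xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]
eye = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
matched = partitioned_info_nce(xyz, eye, eye)
shuffled = partitioned_info_nce(xyz, eye, eye[1:] + eye[:1])
```

In this toy setup the loss is lower when the two views agree on every matched point (`matched`) than when the correspondences are deliberately misaligned (`shuffled`), which is the behaviour a contrastive objective of this kind is meant to reward.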

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | S3DIS (Area 5) | mIoU 72.2 | 1006 |
| Semantic segmentation | ScanNet V2 (val) | mIoU 73.8 | 380 |
| 3D Object Detection | ScanNet V2 (val) | -- | 361 |
| Semantic segmentation | ScanNet (val) | mIoU 73.8 | 302 |
| 3D Visual Grounding | ScanRefer (val) | -- | 253 |
| Semantic segmentation | ScanNet v2 (test) | mIoU 73.8 | 248 |
| 3D Semantic Segmentation | ScanNet V2 (val) | mIoU 73.8 | 209 |
| 3D Instance Segmentation | ScanNet V2 (val) | Average AP50 59.4 | 198 |
| 3D Object Detection | SUN RGB-D (val) | -- | 163 |
| Semantic segmentation | ScanNet200 (val) | mIoU 26.9 | 136 |

Showing 10 of 70 rows
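
Most rows above report mIoU (mean intersection over union). As a minimal sketch of how that metric is commonly defined, assuming flat lists of per-point predicted and ground-truth class ids: the hypothetical helper `mean_iou` below averages per-class IoU over the classes that appear in either list. Benchmark evaluation scripts typically accumulate counts over a whole dataset and handle ignored labels, which this sketch does not.

```python
def mean_iou(pred, target, num_classes):
    """Mean of per-class IoU over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both lists
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy usage: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

The function returns a fraction in [0, 1]; the table reports the same quantity as a percentage.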
