
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

About

The rapid progress in 3D scene understanding has come with a growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) is notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.
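The core idea — a contrastive loss over matched point pairs whose negatives are restricted to spatial partitions around each anchor — can be illustrated with a toy sketch. This is a minimal, hypothetical illustration, not the paper's implementation: the function names, the angular-sector partitioning (the paper also partitions by distance), and the tiny feature vectors are all simplifying assumptions.

```python
import math

def dot(u, v):
    # inner product between two feature vectors
    return sum(a * b for a, b in zip(u, v))

def spatial_bin(anchor_xyz, other_xyz, num_bins=4):
    # Partition the scene around the anchor into angular sectors in the
    # xy-plane (a simplification of the paper's distance+angle contexts).
    dx = other_xyz[0] - anchor_xyz[0]
    dy = other_xyz[1] - anchor_xyz[1]
    angle = math.atan2(dy, dx) % (2 * math.pi)
    return int(angle / (2 * math.pi / num_bins)) % num_bins

def scene_context_loss(feats_a, feats_b, xyzs, tau=0.4, num_bins=4):
    """Toy partitioned InfoNCE: point i in view A matches point i in view B;
    the negatives for each anchor are drawn only from one spatial partition,
    and the per-partition losses are averaged."""
    total, count = 0.0, 0
    n = len(feats_a)
    for i in range(n):
        for b in range(num_bins):
            # candidates: the positive match plus negatives falling in bin b
            idx = [j for j in range(n)
                   if j == i or spatial_bin(xyzs[i], xyzs[j], num_bins) == b]
            if len(idx) < 2:
                continue  # no negatives in this partition
            logits = [dot(feats_a[i], feats_b[j]) / tau for j in idx]
            pos = logits[idx.index(i)]
            denom = sum(math.exp(l) for l in logits)
            total += -(pos - math.log(denom))  # InfoNCE for this partition
            count += 1
    return total / max(count, 1)
```

The point of the partitioning is that negatives are sampled from distinct regions of the scene rather than pooled globally, so the loss exploits spatial context instead of treating the scene as an unordered bag of points.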

Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | S3DIS (Area 5) | mIoU 72.2 | 799 |
| 3D Object Detection | ScanNet V2 (val) | – | 352 |
| Semantic segmentation | ScanNet V2 (val) | mIoU 73.8 | 288 |
| Semantic segmentation | ScanNet V2 (test) | mIoU 73.8 | 248 |
| Semantic segmentation | ScanNet (val) | mIoU 73.8 | 231 |
| 3D Instance Segmentation | ScanNet V2 (val) | AP50 59.4 | 195 |
| 3D Semantic Segmentation | ScanNet V2 (val) | mIoU 73.8 | 171 |
| 3D Object Detection | SUN RGB-D (val) | – | 158 |
| 3D Visual Grounding | ScanRefer (val) | – | 155 |
| 3D Object Detection | ScanNet | – | 123 |

Showing 10 of 67 rows
