
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models

About

Following the strong trend of masked image modeling led by MAE, generative pre-training has shown remarkable potential to boost the performance of fundamental models in 2D vision. In 3D vision, however, over-reliance on Transformer-based backbones and the unordered nature of point clouds have restricted the further development of generative pre-training. In this paper, we propose a novel 3D-to-2D generative pre-training method that is adaptable to any point cloud model. As the pre-training scheme, we generate view images from different instructed poses via a cross-attention mechanism. Generating view images provides more precise supervision than generating point clouds, which helps 3D backbones gain a finer comprehension of the geometric structure and stereoscopic relations of the point cloud. Experimental results demonstrate the superiority of the proposed 3D-to-2D generative pre-training over previous pre-training methods. The method is also effective in boosting the performance of architecture-oriented approaches, achieving state-of-the-art performance when fine-tuning on the ScanObjectNN classification and ShapeNetPart segmentation tasks. Code is available at https://github.com/wangzy22/TAP.

Ziyi Wang, Xumin Yu, Yongming Rao, Jie Zhou, Jiwen Lu • 2023
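
The abstract describes generating a view image from an instructed pose by cross-attending to features of the point cloud. The sketch below illustrates that general idea in NumPy and is not the authors' implementation (see the linked repository for that). Everything in it is an assumption made for illustration: the pose is taken to be a 3x4 matrix, the per-pixel query is built from a normalised pixel coordinate concatenated with the flattened pose, the projection weights are random stand-ins for learned parameters, and the helper names (`make_pose_queries`, `render_view`, `init_params`, `reconstruction_loss`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(feat_dim, dim, rng):
    # Random stand-ins for learned projections (assumption, for illustration).
    # The query input has 2 pixel coordinates + 12 pose entries = 14 values.
    return {
        "w_query": rng.normal(scale=0.1, size=(14, dim)),
        "w_key": rng.normal(scale=0.1, size=(feat_dim, dim)),
        "w_value": rng.normal(scale=0.1, size=(feat_dim, dim)),
        "w_rgb": rng.normal(scale=0.1, size=(dim, 3)),
    }


def make_pose_queries(pose, height, width, w_query):
    # One query per output pixel, conditioned on the instructed pose.
    v, u = np.meshgrid(
        np.linspace(-1.0, 1.0, height),
        np.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    pixels = np.stack([u.ravel(), v.ravel()], axis=1)           # (H*W, 2)
    pose_flat = np.broadcast_to(pose.reshape(1, -1),
                                (pixels.shape[0], pose.size))   # (H*W, 12)
    return np.concatenate([pixels, pose_flat], axis=1) @ w_query


def render_view(point_feats, pose, params, height, width):
    # Cross-attention: pixel queries attend over per-point features,
    # which could come from any point cloud backbone.
    q = make_pose_queries(pose, height, width, params["w_query"])
    k = point_feats @ params["w_key"]
    v = point_feats @ params["w_value"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)      # (H*W, N)
    pixel_feats = attn @ v                                      # (H*W, dim)
    logits = pixel_feats @ params["w_rgb"]
    rgb = 1.0 / (1.0 + np.exp(-logits))                         # values in (0, 1)
    return rgb.reshape(height, width, 3), attn


def reconstruction_loss(pred, target):
    # Pixel-wise mean squared error against the ground-truth view image.
    return float(np.mean((pred - target) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(feat_dim=32, dim=16, rng=rng)
    point_feats = rng.normal(size=(128, 32))   # placeholder backbone output
    pose = np.hstack([np.eye(3), np.zeros((3, 1))])
    image, _ = render_view(point_feats, pose, params, height=8, width=8)
    target = rng.uniform(size=(8, 8, 3))       # placeholder rendered view
    print(image.shape, reconstruction_loss(image, target))
```

Because the backbone only has to supply a set of per-point features as keys and values, a setup of this shape does not depend on the backbone being a Transformer, which is consistent with the abstract's claim that the scheme is adaptable to any point cloud model.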

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Part Segmentation | ShapeNetPart (test) | mIoU (Inst.): 86.9 | 312 |
| Object Classification | ScanObjectNN OBJ_ONLY | Overall Accuracy: 89.5 | 166 |
| Classification | ModelNet40 (test) | -- | 99 |
| Few-shot classification | ModelNet40 5-way 10-shot | Accuracy: 97.3 | 79 |
| Few-shot classification | ModelNet40 10-way 20-shot | Accuracy: 95.8 | 79 |
| Few-shot classification | ModelNet40 10-way 10-shot | Accuracy: 93.1 | 79 |
| Few-shot classification | ModelNet40 5-way 20-shot | Accuracy: 97.8 | 79 |
| Point Cloud Classification | ScanObjectNN OBJ_BG | Overall Accuracy: 90.36 | 64 |
| Point Cloud Classification | ScanObjectNN PB_T50_RS | Overall Accuracy: 88.5 | 63 |
| Object Classification | ScanObjectNN OBJ_ONLY v1.0 | Accuracy: 89.5 | 29 |
Showing 10 of 23 rows
