
PointGPT: Auto-regressively Generative Pre-training from Point Clouds

About

Large language models (LLMs) based on the generative pre-training transformer (GPT) have demonstrated remarkable effectiveness across a diverse range of downstream tasks. Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps. Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models. Our method partitions the input point cloud into multiple point patches and arranges them in an ordered sequence based on their spatial proximity. Then, an extractor-generator based transformer decoder, with a dual masking strategy, learns latent representations conditioned on the preceding point patches, aiming to predict the next one in an auto-regressive manner. Our scalable approach allows for learning high-capacity models that generalize well, achieving state-of-the-art performance on various downstream tasks. In particular, our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models. Furthermore, our method also attains new state-of-the-art accuracies on all four few-shot learning benchmarks.
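The patch partitioning and ordering described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract only says that patches are ordered "based on their spatial proximity", so the specific choices here are assumptions made for illustration: farthest point sampling for patch centers, k-nearest neighbors for patch membership, and a Z-order (Morton) code for the ordering. All function names and default sizes (`num_patches`, `patch_size`, `bits`) are hypothetical.

```python
import numpy as np


def farthest_point_sampling(points, num_centers):
    """Greedily pick indices of points that are far apart from each other."""
    chosen = np.zeros(num_centers, dtype=np.int64)  # chosen[0] = 0: arbitrary fixed start
    dist = np.full(points.shape[0], np.inf)
    for i in range(1, num_centers):
        diff = points - points[chosen[i - 1]]
        # Squared distance from every point to its nearest already-chosen center.
        dist = np.minimum(dist, np.einsum("ij,ij->i", diff, diff))
        chosen[i] = int(np.argmax(dist))
    return chosen


def morton_code(cells, bits=10):
    """Interleave the bits of integer (x, y, z) cells into one Z-order key."""
    codes = np.zeros(cells.shape[0], dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            codes |= ((cells[:, axis] >> b) & 1) << (3 * b + axis)
    return codes


def make_ordered_patches(points, num_patches=8, patch_size=16, bits=10):
    """Split an (N, 3) cloud into patches and sort them along a Z-order curve.

    Assumed design: FPS centers + kNN patches + Morton ordering.
    """
    centers = points[farthest_point_sampling(points, num_patches)]
    # Squared distances from each center to every point: shape (G, N).
    d2 = ((centers[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    knn = np.argsort(d2, axis=1)[:, :patch_size]
    # Express each patch in coordinates relative to its own center.
    patches = points[knn] - centers[:, None, :]
    # Quantize centers onto a 2^bits grid so they can be Morton-encoded.
    lo, hi = points.min(axis=0), points.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)
    cells = ((centers - lo) / scale * (2**bits - 1)).astype(np.int64)
    order = np.argsort(morton_code(cells, bits), kind="stable")
    return patches[order], centers[order]


def next_patch_pairs(patches):
    """Build (preceding patches, next patch) pairs for auto-regressive training."""
    return [(patches[:k], patches[k]) for k in range(1, len(patches))]
```

Calling `make_ordered_patches` on an `(N, 3)` array returns patches of shape `(num_patches, patch_size, 3)` together with their ordered centers, and `next_patch_pairs` turns that sequence into prefix/target pairs. The transformer, the extractor-generator split, and the dual masking strategy are deliberately left out, since the abstract does not specify them in enough detail to sketch.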

Guangyan Chen, Meiling Wang, Yi Yang, Kai Yu, Li Yuan, Yufeng Yue • 2023

Related benchmarks

Task                      Dataset                       Result                   Rank
Part Segmentation         ShapeNetPart (test)           mIoU (Inst.) 86.6        312
Shape classification      ModelNet40 (test)             OA 94.9                  255
Object Classification     ScanObjectNN OBJ_BG           Accuracy 95.8            215
Part Segmentation         ShapeNetPart                  mIoU (Instance) 86.6     198
Object Classification     ScanObjectNN PB_T50_RS        Accuracy 91.9            195
Object Classification     ScanObjectNN OBJ_ONLY         Overall Accuracy 95.2    166
Classification            ModelNet40 (test)             --                       99
Few-shot classification   ModelNet40 5-way 20-shot      Accuracy 98.6            79
Few-shot classification   ModelNet40 5-way 10-shot      Accuracy 96.8            79
Few-shot classification   ModelNet40 10-way 10-shot     Accuracy 92.6            79
Showing 10 of 53 rows
