Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models

About

Pre-trained point cloud models have found extensive applications in 3D understanding tasks such as object classification and part segmentation. However, the prevailing strategy of full fine-tuning in downstream tasks leads to large per-task storage overhead for model parameters, which limits efficiency when applying large-scale pre-trained models. Inspired by the recent success of visual prompt tuning (VPT), this paper explores prompt tuning on pre-trained point cloud models to pursue a balance between performance and parameter efficiency. We find that while instance-agnostic static prompting, e.g., VPT, shows some efficacy in downstream transfer, it is vulnerable to the distribution diversity caused by various types of noise in real-world point cloud data. To overcome this limitation, we propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models. The essence of IDPT is a dynamic prompt generation module that perceives the semantic prior features of each point cloud instance and generates adaptive prompt tokens to enhance the model's robustness. Notably, extensive experiments demonstrate that IDPT outperforms full fine-tuning in most tasks with a mere 7% of the trainable parameters, providing a promising solution for parameter-efficient learning with pre-trained point cloud models. Code is available at https://github.com/zyh16143998882/ICCV23-IDPT.
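The difference between static and instance-aware prompting can be illustrated with a minimal sketch. This is not the authors' implementation: the channel-wise max pooling, the single linear map, and every name below (`static_prompt`, `dynamic_prompt`, `weight`, `bias`) are assumptions chosen only to show the idea that a dynamic prompt is computed from each instance's own token features, whereas a static prompt is the same learned token for every input.

```python
def max_pool(tokens):
    """Channel-wise max over a list of token feature vectors."""
    return [max(channel) for channel in zip(*tokens)]


def linear(x, weight, bias):
    """Apply y = W x + b to a single feature vector stored as a list."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def static_prompt(tokens, prompt):
    """Instance-agnostic prompting: prepend one shared learned token."""
    return [prompt] + tokens


def dynamic_prompt(tokens, weight, bias):
    """Instance-aware prompting (illustrative assumption): pool the
    instance's token features and map them to a prompt token."""
    pooled = max_pool(tokens)
    return [linear(pooled, weight, bias)] + tokens


# Hypothetical 3-channel token features for two point cloud instances.
instance_a = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0]]
instance_b = [[0.0, 3.0, 1.0], [2.0, 0.0, 0.0]]

# Hypothetical "learned" parameters, fixed here for reproducibility.
shared_prompt = [0.2, 0.2, 0.2]
weight = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
bias = [0.1, 0.1, 0.1]

static_a = static_prompt(instance_a, shared_prompt)
static_b = static_prompt(instance_b, shared_prompt)
dynamic_a = dynamic_prompt(instance_a, weight, bias)
dynamic_b = dynamic_prompt(instance_b, weight, bias)
```

In this toy setting the static prompt token is identical for both instances, while the dynamic prompt token differs because it depends on the pooled features of each input. In a parameter-efficient setup, only the prompt parameters (here `shared_prompt`, or `weight` and `bias`) would be trained while the pre-trained backbone stays frozen.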

Yaohua Zha, Jinpeng Wang, Tao Dai, Bin Chen, Zhi Wang, Shu-Tao Xia • 2023

Related benchmarks

Task	Dataset	Result
Semantic segmentation	S3DIS (Area 5)	mIoU 52.6	1006
Part Segmentation	ShapeNetPart (test)	mIoU (Inst.) 85.9	347
Object Classification	ScanObjectNN OBJ_BG	Accuracy 98.11	248
Part Segmentation	ShapeNetPart	mIoU (Instance) 85.9	246
Point Cloud Classification	ModelNet40 (test)	Accuracy 93.4	229
Object Classification	ScanObjectNN PB_T50_RS	Accuracy 92.99	220
Object Classification	ScanObjectNN OBJ_ONLY	Overall Accuracy 96.04	186
Object Classification	ModelNet40 (test)	Accuracy 94.4	180
3D Semantic Segmentation	ScanNet (val)	mIoU 72.6	144
Few-shot classification	ModelNet40 10-way 20-shot	Accuracy 95.5	117

Showing 10 of 62 rows
