LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

About

Visual prompting has gained popularity as a method for adapting pre-trained models to specific tasks, particularly in parameter-efficient tuning. However, existing visual prompting techniques often pad the prompt parameters around the image, which limits the interaction between the visual prompts and the original image to a small set of patches and neglects the inductive bias present in information shared across different patches. In this study, we conduct a thorough preliminary investigation to identify and address these limitations. We propose a novel visual prompt design, Low-Rank matrix multiplication for Visual Prompting (LoR-VP), which enables shared and patch-specific information across rows and columns of image pixels. Extensive experiments across seven network architectures and four datasets demonstrate significant improvements in both performance and efficiency compared to state-of-the-art visual prompting methods: up to 6 times faster training, 18 times fewer visual prompt parameters, and a 3.1% improvement in performance. The code is available at https://github.com/jincan333/LoR-VP.

Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas • 2025
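
The low-rank prompt design described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the prompt for each channel is the product of two learnable factors, B (height x rank) and A (rank x width), and that the prompt is added to an image already resized to the model's input size. The function names, the rank of 4, the 224 x 224 resolution, and the zero initialization of B are all hypothetical choices made for illustration.

```python
import numpy as np


def low_rank_prompt(B, A):
    """Per-channel product of the factors: (C, H, r) @ (C, r, W) -> (C, H, W)."""
    return np.matmul(B, A)


def apply_prompt(image, B, A):
    """Add the low-rank prompt to every pixel of a (C, H, W) image."""
    return image + low_rank_prompt(B, A)


rng = np.random.default_rng(0)
C, H, W, rank = 3, 224, 224, 4  # illustrative sizes, not taken from the paper

# Assumption: B starts at zero so the initial prompt leaves the image unchanged.
B = np.zeros((C, H, rank))
A = rng.normal(size=(C, rank, W))

image = rng.random((C, H, W))
prompted = apply_prompt(image, B, A)

# Parameter count of the two factors versus a dense full-image prompt.
low_rank_params = B.size + A.size  # C * rank * (H + W)
full_params = C * H * W

print(prompted.shape, low_rank_params, full_params)
```

Under these assumptions, every row of the prompt is a linear combination of the same `rank` rows of A, weighted by that row's coefficients in B. The rows of A carry what is shared across the image, while B supplies the row-specific part, and the prompt touches every pixel rather than only a padded border.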

Related benchmarks

Task	Dataset	Result
Image Classification	CIFAR-10	Accuracy 97.52	973
Image Classification	Tiny ImageNet (test)	Accuracy 89.78	859
Image Classification	ImageNet V2	--	767
Image Classification	ImageNet A	Top-1 Acc 19.96	723
Image Classification	CIFAR-100 (test)	--	429
Image Classification	CIFAR-100	Accuracy 88.06	375
Image Classification	VTAB 1K	Overall Mean Accuracy 74.69	359
Image Classification	VTAB 1k (test)	Accuracy (Natural) 72.28	145
Image Classification	Tiny-ImageNet	Accuracy (%) 85.77	134
Visual Task Adaptation	VTAB 1K	--	95

Showing 10 of 22 rows
