LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

About

Visual prompting has gained popularity as a method for adapting pre-trained models to specific tasks, particularly in the realm of parameter-efficient tuning. However, existing visual prompting techniques often pad the prompt parameters around the image, limiting the interaction between the visual prompts and the original image to a small set of patches while neglecting the inductive bias present in shared information across different patches. In this study, we conduct a thorough preliminary investigation to identify and address these limitations. We propose a novel visual prompt design, Low-Rank matrix multiplication for Visual Prompting (LoR-VP), which enables shared and patch-specific information across rows and columns of image pixels. Extensive experiments across seven network architectures and four datasets demonstrate significant improvements in both performance and efficiency compared to state-of-the-art visual prompting methods: training is up to 6 times faster, 18 times fewer visual prompt parameters are used, and performance improves by 3.1%. The code is available at https://github.com/jincan333/LoR-VP.
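
The low-rank idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the prompt for each channel is the product of two small factors, `B` and `A`, that the product is added to the full image, and that `B` starts at zero. The image size, the rank `r`, the initialization, and the function names are all hypothetical choices made for illustration.

```python
import numpy as np


def low_rank_prompt(B, A):
    """Build a per-channel prompt of shape (C, H, W).

    B has shape (C, H, r) and A has shape (C, r, W); np.matmul multiplies
    the last two axes for each channel, so every channel's prompt has
    rank at most r.
    """
    return np.matmul(B, A)


def apply_prompt(image, B, A):
    """Add the low-rank prompt to an image of shape (C, H, W)."""
    return image + low_rank_prompt(B, A)


rng = np.random.default_rng(0)

# Assumed sizes for illustration only; they are not taken from the paper.
C, H, W, r = 3, 224, 224, 4

image = rng.random((C, H, W))
B = np.zeros((C, H, r))             # assumed zero init: prompt starts as a no-op
A = rng.standard_normal((C, r, W))

prompted = apply_prompt(image, B, A)

# Number of trainable prompt parameters in this sketch.
num_params = B.size + A.size
```

In this sketch every pixel of the image receives a prompt value, and pixels in the same row share a row of `B` while pixels in the same column share a column of `A`, which is one way to read the abstract's "shared and patch-specific information across rows and columns".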

Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, Dimitris N. Metaxas • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10 | Accuracy 97.52 | 875 |
| Image Classification | ImageNet V2 | -- | 749 |
| Image Classification | Tiny ImageNet (test) | Accuracy 89.78 | 722 |
| Image Classification | ImageNet A | Top-1 Acc 19.96 | 698 |
| Image Classification | CIFAR-100 (test) | -- | 395 |
| Image Classification | CIFAR-100 | Accuracy 88.06 | 357 |
| Image Classification | Tiny-ImageNet | Accuracy (%) 85.77 | 131 |
| Image-to-Text Retrieval | Flickr30k (val) | Recall@1 90.3 | 70 |
| Text-to-Image Retrieval | Flickr30k (val) | R@1 73.46 | 70 |
| Text-to-Image Retrieval | COCO 2017 | Recall@5 68.17 | 43 |
Showing 10 of 18 rows
