Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

About

Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks: it tunes only a small number of parameters while freezing the vast majority, easing the storage burden and optimization difficulty. However, existing PEFT methods introduce trainable parameters at the same positions across different tasks, relying solely on human heuristics and neglecting domain gaps. To this end, we study where to introduce and how to allocate trainable parameters by proposing a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme, which adaptively allocates trainable parameters to task-specific important positions given a desired tunable parameter budget. Specifically, SPT first quickly identifies the sensitive parameters that require tuning for a given task in a data-dependent way. Next, for weight matrices whose number of sensitive parameters exceeds a pre-defined threshold, SPT further boosts representational capability by using existing structured tuning methods, e.g., LoRA [23] or Adapter [22], in place of directly tuning the selected sensitive parameters (unstructured tuning), while staying within the budget. Extensive experiments on a wide range of downstream recognition tasks show that SPT is complementary to existing PEFT methods and substantially boosts their performance; e.g., SPT improves Adapter with a supervised pre-trained ViT-B/16 backbone by 4.2% and 1.4% mean Top-1 accuracy on the FGVC and VTAB-1k benchmarks, respectively, reaching SOTA performance. Source code is at https://github.com/ziplab/SPT
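
The two-stage allocation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes sensitivity is scored by accumulated squared gradients, which is a common first-order proxy; the abstract itself does not state the criterion. The function names, the `"structured"`/`"unstructured"`/`"frozen"` labels, and the plan layout are all hypothetical, and the sketch does not account for the parameter cost of the structured modules themselves.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def squared_gradient_sensitivity(per_sample_grads: List[Matrix]) -> Matrix:
    """Sum squared gradients over samples (ASSUMED criterion, not from the abstract)."""
    rows, cols = len(per_sample_grads[0]), len(per_sample_grads[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for grad in per_sample_grads:
        for i in range(rows):
            for j in range(cols):
                total[i][j] += grad[i][j] ** 2
    return total


def allocate_trainable(
    sensitivity: Dict[str, Matrix],
    budget: int,
    threshold: int,
) -> Dict[str, dict]:
    """Select the `budget` most sensitive entries, then pick a tuning mode per matrix."""
    # Flatten every entry of every weight matrix together with its score.
    scored: List[Tuple[float, str, int, int]] = [
        (score, name, i, j)
        for name, mat in sensitivity.items()
        for i, row in enumerate(mat)
        for j, score in enumerate(row)
    ]
    # Rank globally and keep the top-`budget` entries.
    scored.sort(key=lambda t: t[0], reverse=True)
    selected = scored[:budget]

    plan: Dict[str, dict] = {
        name: {"mode": "frozen", "positions": []} for name in sensitivity
    }
    for _, name, i, j in selected:
        plan[name]["positions"].append((i, j))

    for entry in plan.values():
        count = len(entry["positions"])
        if count > threshold:
            # Many sensitive entries: a structured module (e.g. LoRA or Adapter)
            # replaces direct tuning of the individual positions.
            entry["mode"] = "structured"
            entry["positions"] = []
        elif count > 0:
            # Few sensitive entries: tune only those positions directly.
            entry["mode"] = "unstructured"
    return plan
```

With a budget of 4 and a threshold of 2, a matrix holding three of the four selected entries would be marked for structured tuning, a matrix holding one would be tuned at that single position, and a matrix holding none would stay frozen.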

Haoyu He, Jianfei Cai, Jing Zhang, Dacheng Tao, Bohan Zhuang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | ADE20K (val) | mIoU 45.4 | 2731 |
| Image Classification | Food-101 | -- | 494 |
| Image Classification | ImageNet-R | Top-1 Acc 72.6 | 474 |
| Image Classification | SVHN | -- | 359 |
| Image Classification | CIFAR-100 | -- | 302 |
| Image Classification | VTAB 1K | Overall Mean Accuracy 76.4 | 204 |
| Image Classification | VTAB-1K 1.0 (test) | Natural Accuracy 76.6 | 102 |
| Visual Task Adaptation | VTAB 1K | Average Accuracy 76.4 | 78 |
| Fine-grained Visual Categorization | FGVC | Mean Accuracy 90.1 | 40 |
| Visual Task Adaptation | VTAB-1k v1 (test) | Mean Accuracy 78.7 | 29 |
Showing 10 of 24 rows
