
AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers

About

Frozen-backbone transfer with Vision Transformers faces two under-addressed issues: optimization instability when adapters are naively inserted into a fixed feature extractor, and the absence of principled guidance for setting adapter capacity. We introduce AdapterTune, which augments each transformer block with a residual low-rank bottleneck whose up-projection is zero-initialized, guaranteeing that the adapted network starts exactly at the pretrained function and eliminating early-epoch representation drift. On the analytical side, we formalize adapter rank as a capacity budget for approximating downstream task shifts in feature space. The resulting excess-risk decomposition predicts monotonic but diminishing accuracy gains with increasing rank, an "elbow" behavior we confirm through controlled sweeps. We evaluate on 9 datasets and 3 backbone scales with multi-seed reporting throughout. On a core 5-dataset transfer suite, AdapterTune improves top-1 accuracy over head-only transfer by +14.9 points on average while training only 0.92 of the parameters required by full fine-tuning, and outperforms full fine-tuning on 10 of 15 dataset-backbone pairs. Across the full benchmark, AdapterTune improves over head-only transfer on every dataset-backbone pair tested. Ablations on rank, placement, and initialization isolate each design choice. The code is available at: https://github.com/salimkhazem/adaptertune
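To make the zero-initialization argument concrete, the following is a minimal pure-Python sketch of a residual low-rank bottleneck, not the released implementation. The class name `LowRankAdapter`, the helper `matvec`, the ReLU nonlinearity, the omission of biases, and the 0.02 initialization scale are all assumptions made for illustration. It shows that when the up-projection starts at zero, the adapter branch contributes nothing and the block reproduces its input exactly.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


class LowRankAdapter:
    """Residual bottleneck: x + W_up @ relu(W_down @ x).

    Hypothetical illustration: names, the ReLU nonlinearity, the lack of
    biases, and the init scale are assumptions, not taken from the paper.
    """

    def __init__(self, dim, rank, seed=0):
        rng = random.Random(seed)
        # Down-projection (rank x dim): small random initialization.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                       for _ in range(rank)]
        # Up-projection (dim x rank): zero initialization, so the adapter
        # branch outputs exactly zero before any training step.
        self.w_up = [[0.0] * rank for _ in range(dim)]

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.w_down, x)]
        delta = matvec(self.w_up, hidden)
        # Residual connection: input plus the low-rank update.
        return [xi + di for xi, di in zip(x, delta)]

    def num_params(self):
        # Two weight matrices of shape (rank x dim) and (dim x rank).
        return 2 * len(self.w_up) * len(self.w_down)


adapter = LowRankAdapter(dim=8, rank=2)
x = [float(i) for i in range(8)]
assert adapter(x) == x  # identity at initialization
```

In this sketch the trainable parameter count grows linearly with rank (`2 * dim * rank`), which is one simple way to read rank as a capacity budget.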

Salim Khazem • 2026

Related benchmarks

Task                  Dataset          Result                Rank
Image Classification  Flowers102       Accuracy 99.43        558
Image Classification  Food-101         --                    542
Image Classification  CIFAR-10         Accuracy 98.9         508
Image Classification  Tiny-ImageNet    Top-1 Accuracy 90     230
Image Classification  ImageNet-R       --                    217
Image Classification  CIFAR-100        Accuracy 91.2         117
Image Classification  FGVC Aircraft    Top-1 Acc 74.79       92
Image Classification  Oxford-IIIT Pet  Top-1 Accuracy 94.3   55
Image Classification  SVHN             Accuracy 97.5         43
