1%>100%: High-Efficiency Visual Adapter with Complex Linear Projection Optimization

About

Deploying vision foundation models typically relies on efficient adaptation strategies, because conventional full fine-tuning is prohibitively expensive and inefficient. While delta-tuning has proven effective at improving the performance and efficiency of LLM adaptation, its advantages do not transfer directly to the fine-tuning pipeline of vision foundation models. To push the boundaries of adaptation efficiency for vision tasks, we propose an adapter with Complex Linear Projection Optimization (CoLin). For architecture, we design a novel low-rank complex adapter that adds only about 1% new parameters relative to the backbone. For efficiency, we theoretically prove that low-rank composite matrices suffer from severe convergence issues during training, and we address this challenge with a tailored loss. Extensive experiments on object detection, segmentation, image classification, and rotated object detection (a remote sensing scenario) demonstrate that CoLin is the first approach to outperform both full fine-tuning and classical delta-tuning methods with only about 1% additional parameters, providing a novel and efficient solution for deploying vision foundation models. The code is available at https://github.com/DongshuoYin/CoLin.
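
To make the "about 1% parameters" figure concrete, the sketch below counts the parameters of a generic low-rank residual adapter with complex-valued projections. It is an illustration under stated assumptions, not the CoLin architecture: the class LowRankComplexAdapter, the helper adapter_fraction, the zero-initialised up projection, the choice to keep only the real part of the update, and the sizes dim=768 and rank=24 are all hypothetical. The reference block size uses the standard estimate of 12 * dim^2 weights for a Transformer block with an MLP ratio of 4, ignoring biases and normalisation layers.

```python
import random


def complex_matvec(matrix, vec):
    """Multiply a complex matrix (list of rows) by a complex vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_complex_matrix(rows, cols, scale, rng):
    """Small random complex weights; the initialisation scheme is an assumption."""
    return [
        [complex(rng.gauss(0, scale), rng.gauss(0, scale)) for _ in range(cols)]
        for _ in range(rows)
    ]


class LowRankComplexAdapter:
    """Hypothetical adapter: x -> x + Re(up @ (down @ x)). Not the paper's design."""

    def __init__(self, dim, rank, seed=0):
        rng = random.Random(seed)
        self.dim, self.rank = dim, rank
        self.down = random_complex_matrix(rank, dim, 0.02, rng)  # dim -> rank
        # Zero-initialised up projection, so the adapter starts as the identity.
        self.up = [[0j] * rank for _ in range(dim)]  # rank -> dim

    def num_real_parameters(self):
        # Each complex weight stores two real numbers (real and imaginary part).
        return 2 * (self.rank * self.dim + self.dim * self.rank)

    def __call__(self, x):
        hidden = complex_matvec(self.down, [complex(v) for v in x])
        delta = complex_matvec(self.up, hidden)
        # Keep the real part so the output stays in the real feature space.
        return [xi + d.real for xi, d in zip(x, delta)]


def adapter_fraction(dim, rank):
    """Adapter size relative to one Transformer block of about 12 * dim**2 weights."""
    adapter_params = 2 * (2 * dim * rank)  # two complex projections
    block_params = 12 * dim * dim          # attention (4d^2) + MLP with ratio 4 (8d^2)
    return adapter_params / block_params


if __name__ == "__main__":
    # With the assumed sizes the ratio is rank / (3 * dim) = 1/96, roughly 1%.
    print(f"{adapter_fraction(768, 24):.4f}")
```

Under these assumptions the ratio simplifies to rank / (3 * dim), so the adapter budget scales linearly with the chosen rank; the actual rank, placement, and backbone used by CoLin are not stated here and should be taken from the released code.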

Dongshuo Yin, Xue Yang, Deng-Ping Fan, Shi-Min Hu • 2026

Related benchmarks

Task                        Dataset             Metric      Result    Rank
Semantic Segmentation       ADE20K (val)        mIoU        51.28     2731
Image Classification        Flowers102 (test)   Accuracy    99.6619   68
Oriented Object Detection   STAR (test)         AP          39.22     60
Rotated Object Detection    DOTA 1.0 (test)     mAP         78.39     46
Object Detection            Pascal VOC (test)   mAP         87.5      18
Instance Segmentation       COCO                AP (mask)   45.5      10
Object Detection            COCO                AP (box)    52.9      10