
Collaborative Low-Rank Adaptation for Pre-Trained Vision Transformers

About

Low-rank adaptation (LoRA) has achieved remarkable success in fine-tuning pre-trained vision transformers for various downstream tasks. Existing studies mainly focus on exploring more parameter-efficient strategies or more effective representation learning schemes. However, these methods either sacrifice fine-tuning performance or introduce excessive trainable parameters, failing to strike a balance between learning performance and parameter efficiency. To address this problem, we propose a novel tuning method named collaborative low-rank adaptation (CLoRA) in this paper. CLoRA consists of base-space sharing and sample-agnostic diversity enhancement (SADE) components. To maintain parameter efficiency while expanding the learning capacity of low-rank modules (LRMs), base-space sharing allows all LRMs to share a set of down/up-projection spaces. In CLoRA, the low-rank matrices obtained from the shared spaces collaboratively construct each LRM. Since the representations extracted by these matrices may contain redundant information, SADE is employed to regularize the similarities among them to encourage diverse representations in the training process. We conduct extensive experiments on widely used image and point cloud datasets to evaluate the performance of CLoRA. Experimental results demonstrate that CLoRA strikes a better balance between learning performance and parameter efficiency, while requiring the fewest GFLOPs for point cloud analysis, compared with the state-of-the-art methods.
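The two components described above can be illustrated with a toy sketch. This is a minimal NumPy illustration, not the paper's implementation: the bank sizes, the per-layer combination coefficients, and the squared-cosine form of the diversity penalty are all assumptions made for clarity. It shows (a) a shared bank of down/up-projection matrices from which each low-rank module's update is collaboratively built, and (b) a SADE-style regularizer that penalizes similarity among the representations extracted by the shared down-projections.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: hidden dim, rank, number of shared down/up-projection pairs.
d, r, K = 16, 4, 3

# Shared bank of projection spaces, reused by every low-rank module (LRM).
A_bank = [rng.standard_normal((r, d)) * 0.01 for _ in range(K)]  # down-projections
B_bank = [rng.standard_normal((d, r)) * 0.01 for _ in range(K)]  # up-projections

def lrm_delta(coeffs):
    """Collaboratively build one layer's low-rank weight update from the
    shared bank; `coeffs` are hypothetical per-layer combination weights."""
    return sum(c * B @ A for c, A, B in zip(coeffs, A_bank, B_bank))

def sade_penalty(x):
    """Diversity regularizer (assumed squared-cosine form): penalize pairwise
    similarity among the representations A_k @ x so that the shared
    down-projections extract less redundant information."""
    feats = [A @ x for A in A_bank]
    pen = 0.0
    for i in range(K):
        for j in range(i + 1, K):
            cos = feats[i] @ feats[j] / (
                np.linalg.norm(feats[i]) * np.linalg.norm(feats[j]))
            pen += cos ** 2
    return pen / (K * (K - 1) / 2)  # average over all pairs

x = rng.standard_normal(d)              # one token's hidden state
coeffs = np.ones(K) / K                 # uniform combination for illustration
delta_W = lrm_delta(coeffs)             # ΔW added to the frozen weight matrix
print(delta_W.shape)                    # (16, 16)
```

Because every LRM draws on the same K projection pairs rather than owning its own, the trainable-parameter count grows with the bank size instead of the layer count, which is how base-space sharing keeps the method parameter-efficient.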

Zheng Liu, Jinchao Zhu, Gao Huang • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Part Segmentation | ShapeNetPart (test) | mIoU (Inst.) | 85.7 | 312
Point Cloud Classification | ModelNet40 (test) | Accuracy | 93.8 | 224
Image Classification | VTAB-1K 1.0 (test) | Natural Accuracy | 84.7 | 102
Point Cloud Classification | ScanObjectNN PB_T50_RS (test) | Overall Accuracy | 89.51 | 91
3D Object Recognition | ScanObjectNN OBJ_BG (test) | Top-1 Accuracy | 94.32 | 35
Fine-grained Visual Categorization | FGVC (CUB-200-2011, NABirds, Oxford Flowers, Stanford Cars, Stanford Dogs) (test) | CUB-200-2011 Accuracy | 89.2 | 32
Point Cloud Object Recognition | ScanObjectNN OBJ_ONLY (test) | Accuracy | 92.77 | 21
Point Cloud Classification | ModelNet40 5-way 20-shot | Accuracy | 98.2 | 21
Point Cloud Classification | ModelNet40 10-way 20-shot | Accuracy | 95.7 | 21
Point Cloud Classification | ModelNet40 5-way 10-shot | Accuracy | 96.8 | 21
(10 of 11 rows shown)
