LoRA-FA: Efficient and Effective Low Rank Representation Fine-tuning

About

Fine-tuning large language models (LLMs) is crucial for improving their performance on downstream tasks, but full-parameter fine-tuning (Full-FT) is computationally expensive and memory-intensive. Parameter-efficient fine-tuning (PEFT) methods, such as Low-Rank Adaptation (LoRA), address this by optimizing only a small subset of parameters. However, LoRA may underperform Full-FT in certain scenarios due to the intrinsic limitations of its low-rank gradients. In this work, we reveal an asymmetric, collapsible structure in LoRA's update: the low-rank modification to W can be reformulated as a single-layer linear regression, implying that one of the LoRA factors can be frozen without sacrificing expressivity. Leveraging this insight, we introduce LoRA-FA, which freezes the projection-down matrix A and trains only the projection-up matrix B. We further close the gap to Full-FT by deriving closed-form gradient corrections that minimize the discrepancy between the induced low-rank gradient and the full gradient. Through extensive experiments on diverse benchmarks, including GLUE, GSM8K, MT-Bench, and HumanEval, we demonstrate that LoRA-FA consistently achieves comparable performance to existing PEFT methods and Full-FT. Experiments on system efficiency show that LoRA-FA significantly reduces activation memory consumption and computational workload in fine-tuning. Our code is available at https://github.com/huggingface/peft.
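The core idea in the abstract, freezing the projection-down matrix A and training only the projection-up matrix B, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `LoRAFALinear`, the `scale` parameter, the zero initialization of B, and the random stand-in for the pretrained weight W are all assumptions made for illustration, and the closed-form gradient corrections mentioned above are omitted because the abstract does not state them. The sketch shows that the gradient for B needs only the rank-r projection z = xA^T, not the full input x.

```python
import numpy as np


class LoRAFALinear:
    """Hypothetical layer: y = x W^T + scale * (x A^T) B^T, with W and A frozen."""

    def __init__(self, d_in, d_out, rank, scale=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Random stand-in for a frozen pretrained weight (assumption).
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
        # Frozen projection-down matrix A: initialized once, never updated.
        self.A = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)
        # Trainable projection-up matrix B; zero init keeps the initial
        # output equal to the frozen layer's output (common LoRA convention).
        self.B = np.zeros((d_out, rank))
        self.scale = scale
        self._z = None

    def forward(self, x):
        z = x @ self.A.T          # shape (batch, rank)
        self._z = z               # the only activation cached for backward
        return x @ self.W.T + self.scale * (z @ self.B.T)

    def backward(self, grad_y):
        # Gradient for B uses only the cached rank-r activation z.
        grad_B = self.scale * grad_y.T @ self._z
        # Gradient for the layer input uses weights only, not the stored x.
        grad_x = grad_y @ self.W + self.scale * (grad_y @ self.B) @ self.A
        return grad_B, grad_x

    def trainable_params(self):
        return self.B.size


def demo(steps=50, lr=0.1):
    """Fit B to random targets with plain gradient descent (toy setup)."""
    rng = np.random.default_rng(0)
    layer = LoRAFALinear(d_in=16, d_out=8, rank=4)
    x = rng.standard_normal((32, 16))
    target = rng.standard_normal((32, 8))
    A_before = layer.A.copy()
    losses = []
    for _ in range(steps):
        y = layer.forward(x)
        losses.append(0.5 * np.sum((y - target) ** 2) / len(x))
        grad_B, _ = layer.backward((y - target) / len(x))
        layer.B -= lr * grad_B    # only B is updated
    return layer, losses, A_before


if __name__ == "__main__":
    layer, losses, _ = demo()
    print("trainable parameters:", layer.trainable_params())
    print("loss went down:", losses[-1] < losses[0])
```

With these toy sizes (d_in = 16, d_out = 8, rank = 4), the sketch trains 8 × 4 = 32 parameters, whereas a standard LoRA adapter of the same rank would train 4 × (16 + 8) = 96. The cached activation per example has 4 entries (z) rather than 16 (x), which is the kind of activation-memory saving the abstract refers to.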

Longteng Zhang, Lin Zhang, Shaohuai Shi, Xiaowen Chu, Bo Li • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Pass@1 | 15.91 | 1043 |
| Commonsense Reasoning | PIQA | Accuracy | 75.97 | 757 |
| Natural Language Understanding | GLUE | SST-2 | 93.65 | 551 |
| Reading Comprehension | RACE high | Accuracy | 79.03 | 295 |
| Image Classification | VTAB 1K | Overall Mean Accuracy | 68.2 | 281 |
| Common Sense Reasoning | HellaSwag | Accuracy | 89.16 | 213 |
| Reading Comprehension | RACE mid | Accuracy | 82.79 | 196 |
| Common Sense Reasoning | WinoGrande | Accuracy | 0.8216 | 189 |
| Mathematical Reasoning | GSM8K (val) | Accuracy | 40.25 | 108 |
| Code Generation | MBPP | Pass@1 Accuracy | 20.01 | 59 |
Showing 10 of 35 rows
