LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning
About
The low-rank adaptation (LoRA) method can greatly reduce the number of trainable parameters for fine-tuning large language models (LLMs); however, it still requires expensive activation memory to update the low-rank weights. Reducing the number of LoRA layers or using activation recomputation can harm fine-tuning performance or increase computational overhead. In this work, we present LoRA-FA, a memory-efficient fine-tuning method that reduces activation memory without performance degradation or expensive recomputation. LoRA-FA freezes the projection-down weight $A$ and updates only the projection-up weight $B$ in each LoRA layer. This ensures that the change of the model weight resides in a low-rank space during LLM fine-tuning, while eliminating the need to store the full-rank input activations. We conduct extensive experiments across multiple model types (RoBERTa, T5, LLaMA) and model scales. Our results show that LoRA-FA consistently achieves fine-tuning accuracy close to that of full-parameter fine-tuning and LoRA across different tasks. Furthermore, LoRA-FA reduces the overall memory cost by up to 1.4$\times$ compared to LoRA.
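To make the idea concrete, below is a minimal PyTorch-style sketch of a LoRA-FA linear layer. The class name `LoRAFALinear` and the hyperparameters (`rank`, `alpha`, the initialization scale) are illustrative assumptions, not the authors' reference implementation; the point is simply that the pretrained weight and $A$ are frozen while only $B$ is trained, so the backward pass needs only the low-rank activation $xA^\top$ rather than the full-rank input $x$.

```python
import torch
import torch.nn as nn


class LoRAFALinear(nn.Module):
    """Illustrative sketch of a LoRA-FA layer: freeze W and A, train only B."""

    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        # Frozen pretrained weight W.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)

        # Projection-down A: randomly initialized and frozen (LoRA-FA).
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01,
                                   requires_grad=False)
        # Projection-up B: initialized to zero and trained.
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Only the r-dimensional activation x @ A^T is needed to compute the
        # gradient of B, so the full-rank input x need not be stored.
        low_rank_act = x @ self.lora_A.T
        return self.base(x) + self.scaling * (low_rank_act @ self.lora_B.T)
```

In such a setup, the optimizer would be constructed over the `lora_B` parameters only, e.g. `torch.optim.AdamW([p for p in model.parameters() if p.requires_grad])`.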
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Pass@1 | 15.91 | 1036 |
| Commonsense Reasoning | PIQA | Accuracy | 75.97 | 751 |
| Natural Language Understanding | GLUE | SST-2 | 93.65 | 531 |
| Reading Comprehension | RACE high | Accuracy | 79.03 | 295 |
| Commonsense Reasoning | HellaSwag | Accuracy | 89.16 | 213 |
| Reading Comprehension | RACE mid | Accuracy | 82.79 | 196 |
| Commonsense Reasoning | WinoGrande | Accuracy | 0.8216 | 189 |
| Mathematical Reasoning | GSM8K (val) | Accuracy | 40.25 | 81 |
| Code Generation | MBPP | Pass@1 | 20.01 | 59 |
| Mathematical Reasoning | MATH (val) | Accuracy | 5.66 | 48 |