LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
About
Large pre-trained models are commonly adapted to downstream tasks with parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA), which injects small trainable low-rank matrices instead of updating all weights. While LoRA dramatically reduces the number of trainable parameters at little overhead, it can still underperform full fine-tuning in accuracy and often converges more slowly. We introduce LoFT, a novel low-rank adaptation method that behaves like full fine-tuning by aligning the optimizer's internal dynamics with those of updating all model weights. LoFT not only learns weight updates in a low-rank subspace (as LoRA does) but also properly projects the optimizer's first and second moments (Adam's momentum and variance) into that same subspace, mirroring full-model updates. Because the low-rank update itself is aligned with the full update, LoFT eliminates the need to tune extra hyperparameters such as the LoRA scaling factor $\alpha$. Empirically, this approach substantially narrows the performance gap between adapter-based tuning and full fine-tuning and consistently outperforms standard LoRA-style methods, all without increasing inference cost.
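To make the projection idea concrete, below is a minimal PyTorch sketch of one such update step, under a simplified reading of the abstract: the full-weight gradient and Adam's moments are restricted to the subspace spanned by the current factors `B` and `A` before the update is folded back into the factors. The function names (`orth_projector`, `loft_step`), the full-shape moment buffers, and the choice to fold the update into `B` via a pseudo-inverse are all illustrative assumptions, not the authors' released algorithm.

```python
# Hypothetical sketch of a LoFT-style step for W = W0 + B @ A (not the
# authors' code). The gradient w.r.t. the virtual full weight W and the
# Adam moments are projected onto the subspace spanned by B (columns)
# and A (rows), mirroring what full fine-tuning would accumulate there.
import torch

def orth_projector(M: torch.Tensor) -> torch.Tensor:
    """Orthogonal projector onto the column space of M (d x r, d >= r)."""
    Q, _ = torch.linalg.qr(M)   # reduced QR: Q is (d, r) with orthonormal columns
    return Q @ Q.T              # (d, d) projector

def loft_step(B, A, grad_W, state, lr=1e-4, betas=(0.9, 0.999), eps=1e-8):
    """One illustrative projected Adam step.

    B: (m, r), A: (r, n), grad_W: gradient w.r.t. the full weight, (m, n).
    state: dict with Adam moments "m", "v" (kept full-shape here for clarity)
    and an integer "step" counter.
    """
    P_B = orth_projector(B)     # (m, m): column space of B
    P_A = orth_projector(A.T)   # (n, n): row space of A

    # Restrict the full-weight gradient to the low-rank subspace.
    g = P_B @ grad_W @ P_A

    # Adam moments accumulated from the projected gradient, so their
    # dynamics match those of a full-model optimizer within the subspace.
    state["step"] += 1
    state["m"] = betas[0] * state["m"] + (1 - betas[0]) * g
    state["v"] = betas[1] * state["v"] + (1 - betas[1]) * g * g
    m_hat = state["m"] / (1 - betas[0] ** state["step"])
    v_hat = state["v"] / (1 - betas[1] ** state["step"])

    # Adam-style update from the projected moments; folding it back
    # through pinv(A) keeps the parametrization W = W0 + B @ A intact.
    delta = -lr * m_hat / (v_hat.sqrt() + eps)
    B = B + delta @ torch.linalg.pinv(A)   # (m, n) @ (n, r) -> (m, r)
    return B, A

# Example usage with toy shapes (m=64, n=32, rank r=4):
m, n, r = 64, 32, 4
B = torch.randn(m, r) * 0.01
A = torch.randn(r, n) * 0.01
state = {"m": torch.zeros(m, n), "v": torch.zeros(m, n), "step": 0}
grad_W = torch.randn(m, n)   # stand-in for the backpropagated full-weight gradient
B, A = loft_step(B, A, grad_W, state)
```

Note that this sketch stores the moments in the full weight shape purely for readability; a memory-efficient implementation would presumably keep them in the factored, rank-$r$ form so that LoRA's parameter savings carry over to the optimizer state.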
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Pass@1 | 48.5 | 1036 |
| Language Modeling | WikiText2 (val) | Perplexity (PPL) | 19.26 | 387 |
| Code Generation | HumanEval+ | Pass@1 | 43.6 | 383 |
| Commonsense Reasoning | Common Sense Reasoning Tasks | Avg Score | 90.66 | 316 |
| Commonsense Reasoning | Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) (test) | BoolQ Accuracy | 75.63 | 202 |
| Image Classification | DomainNet | Accuracy | 71.97 | 87 |
| Image Classification | HAM10000 | Accuracy | 93.13 | 19 |
| Image Classification | ISIC 2019 | Accuracy | 81.06 | 6 |
| Image Classification | Diabetic Retinopathy (test) | Accuracy | 58.49 | 3 |