LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
About
Large pre-trained models are commonly adapted to downstream tasks with parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA), which injects small trainable low-rank matrices instead of updating all weights. While LoRA dramatically reduces the number of trainable parameters at little overhead, it can still underperform full fine-tuning in accuracy and often converges more slowly. We introduce LoFT, a novel low-rank adaptation method that behaves like full fine-tuning by aligning the optimizer's internal dynamics with those of updating all model weights. LoFT not only learns weight updates in a low-rank subspace (as LoRA does) but also properly projects the optimizer's first and second moments (Adam's momentum and variance) into that same subspace, mirroring full-model updates. Because the low-rank update itself is aligned with the full update, LoFT eliminates the need to tune extra hyperparameters such as the LoRA scaling factor $\alpha$. Empirically, this approach substantially narrows the performance gap between adapter-based tuning and full fine-tuning and consistently outperforms standard LoRA-style methods, all without increasing inference cost.
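To make the projection idea concrete, below is a minimal PyTorch sketch of one such update step, under a simplified reading of the abstract: the full-weight gradient and Adam's moments are restricted to the subspace spanned by the current factors `B` and `A` before the update is folded back into the factors. The function names (`orth_projector`, `loft_step`), the full-shape moment buffers, and the choice to fold the update into `B` via a pseudo-inverse are all illustrative assumptions, not the authors' released algorithm.

```python
# Hypothetical sketch of a LoFT-style step for W = W0 + B @ A (not the
# authors' code). The gradient w.r.t. the virtual full weight W and the
# Adam moments are projected onto the subspace spanned by B (columns)
# and A (rows), mirroring what full fine-tuning would accumulate there.
import torch

def orth_projector(M: torch.Tensor) -> torch.Tensor:
    """Orthogonal projector onto the column space of M (d x r, d >= r)."""
    Q, _ = torch.linalg.qr(M)   # reduced QR: Q is (d, r) with orthonormal columns
    return Q @ Q.T              # (d, d) projector

def loft_step(B, A, grad_W, state, lr=1e-4, betas=(0.9, 0.999), eps=1e-8):
    """One illustrative projected Adam step.

    B: (m, r), A: (r, n), grad_W: gradient w.r.t. the full weight, (m, n).
    state: dict with Adam moments "m", "v" (kept full-shape here for clarity)
    and an integer "step" counter.
    """
    P_B = orth_projector(B)     # (m, m): column space of B
    P_A = orth_projector(A.T)   # (n, n): row space of A

    # Restrict the full-weight gradient to the low-rank subspace.
    g = P_B @ grad_W @ P_A

    # Adam moments accumulated from the projected gradient, so their
    # dynamics match those of a full-model optimizer within the subspace.
    state["step"] += 1
    state["m"] = betas[0] * state["m"] + (1 - betas[0]) * g
    state["v"] = betas[1] * state["v"] + (1 - betas[1]) * g * g
    m_hat = state["m"] / (1 - betas[0] ** state["step"])
    v_hat = state["v"] / (1 - betas[1] ** state["step"])

    # Adam-style update from the projected moments; folding it back
    # through pinv(A) keeps the parametrization W = W0 + B @ A intact.
    delta = -lr * m_hat / (v_hat.sqrt() + eps)
    B = B + delta @ torch.linalg.pinv(A)   # (m, n) @ (n, r) -> (m, r)
    return B, A

# Example usage with toy shapes (m=64, n=32, rank r=4):
m, n, r = 64, 32, 4
B = torch.randn(m, r) * 0.01
A = torch.randn(r, n) * 0.01
state = {"m": torch.zeros(m, n), "v": torch.zeros(m, n), "step": 0}
grad_W = torch.randn(m, n)   # stand-in for the backpropagated full-weight gradient
B, A = loft_step(B, A, grad_W, state)
```

Note that this sketch stores the moments in the full weight shape purely for readability; a memory-efficient implementation would presumably keep them in the factored, rank-$r$ form so that LoRA's parameter savings carry over to the optimizer state.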
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Pass@1 | 48.5 | 1036 |
| Language Modeling | WikiText2 (val) | Perplexity (PPL) | 19.26 | 387 |
| Code Generation | HumanEval+ | Pass@1 | 43.6 | 383 |
| Commonsense Reasoning | Common Sense Reasoning Tasks | Avg Score | 90.66 | 316 |
| Commonsense Reasoning | Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) (test) | BoolQ Accuracy | 75.63 | 202 |
| Image Classification | DomainNet | Accuracy | 71.97 | 87 |
| Image Classification | HAM10000 | Accuracy | 93.13 | 19 |
| Image Classification | ISIC 2019 | Accuracy | 81.06 | 6 |
| Image Classification | Diabetic Retinopathy (test) | Accuracy | 58.49 | 3 |