Balanced LoRA: Removing Parameter Invariance to Accelerate Convergence

About

Low-Rank Adaptation (LoRA) is the most widely adopted method for fine-tuning large language models. Notably, LoRA is inherently overparameterized: multiple pairs of low-rank factors can yield the same adapted weight matrix. We show, both theoretically and empirically, that these pairs exhibit significantly different condition numbers. As a result, converging to different loss minimizers directly impacts the convergence rate of LoRA. Building on this observation, we introduce Balanced Low-Rank Adaptation (BaLoRA), a variant of LoRA that projects iterates onto a balanced manifold. This manifold improves the conditioning of the loss landscape while preserving the adapted matrix. The projection step is computationally lightweight and integrates seamlessly into existing fine-tuning pipelines. Empirically, BaLoRA converges faster than standard LoRA and achieves superior performance across a range of fine-tuning tasks.
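
The invariance described above can be illustrated with a small NumPy sketch. It rests on assumptions that are not taken from the paper: the adapter is written as a product B @ A, and "balanced" is taken to mean B.T @ B == A @ A.T, one standard notion from the matrix factorization literature. The function name `balance` and the QR-plus-SVD construction are hypothetical choices for illustration; the paper's exact manifold and projection step may differ.

```python
import numpy as np

def balance(B, A):
    """Rebalance LoRA factors without changing the product B @ A.

    Illustrative only: returns (B_bal, A_bal) such that
    B_bal @ A_bal == B @ A and B_bal.T @ B_bal == A_bal @ A_bal.T.
    """
    # Thin QR of each factor, so the SVD below is only r x r.
    Qb, Rb = np.linalg.qr(B)      # B = Qb @ Rb, Qb is (m, r)
    Qa, Ra = np.linalg.qr(A.T)    # A = Ra.T @ Qa.T, Qa is (n, r)
    U, s, Vt = np.linalg.svd(Rb @ Ra.T)
    sqrt_s = np.sqrt(s)
    # Split each singular value evenly between the two factors.
    B_bal = (Qb @ U) * sqrt_s
    A_bal = (sqrt_s[:, None] * Vt) @ Qa.T
    return B_bal, A_bal

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2
B = rng.standard_normal((m, r))
A = rng.standard_normal((r, n))

# Any invertible r x r matrix G gives a different pair with the same product.
G = np.diag([100.0, 0.01])
B_skewed = B @ G
A_skewed = np.linalg.inv(G) @ A

B_bal, A_bal = balance(B_skewed, A_skewed)

print("same product before balancing:", np.allclose(B_skewed @ A_skewed, B @ A))
print("same product after balancing: ", np.allclose(B_bal @ A_bal, B @ A))
print("cond(B) skewed vs balanced:   ", np.linalg.cond(B_skewed), np.linalg.cond(B_bal))
```

The skewed pair and the balanced pair represent the same adapted matrix, yet their factors have very different condition numbers, which is the effect the abstract attributes to the overparameterization. Working on the small r x r core keeps the cost of this sketch low relative to a full SVD of the product.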

Valérie Castin, Kimia Nadjahi, Pierre Ablin, Gabriel Peyré • 2026

Related benchmarks

Task | Dataset | Result | Rank
Language Modeling | Wikitext-2 raw v1 | Loss 2.251 | 10
Mathematical Reasoning | GSM8K (test) | Loss 0.493 | 10
Fine-tuning | Alpaca | Evaluation Loss 1.35 | 7
Fine-tuning | OpenOrca | Evaluation Loss 0.773 | 7
Fine-tuning | CodeFeedback | Evaluation Loss 0.638 | 7
Fine-tuning | WizardLM | Evaluation Loss 0.662 | 7
Fine-tuning | OpenHermes | Evaluation Loss 0.707 | 7
Mathematical Reasoning | MetaMathQA 100k-samples | Loss 0.144 | 5
