GAIN: Multiplicative Modulation for Domain Adaptation
About
Adapting LLMs to new domains causes forgetting because standard methods (e.g., full fine-tuning, LoRA) inject new directions into the weight space. We show that forgetting is governed by one algebraic property: whether the update preserves the column span of the pretrained weight matrix (Proposition 1). We propose GAIN, the simplest multiplicative alternative (W_new = S * W), which satisfies this by construction and can be absorbed into existing weights for zero inference cost. Across five models (774M to 70B) adapted sequentially over eight domains, GAIN improves earlier-domain perplexity by 7-13%, while LoRA degrades it by 18-36%. GAIN matches replay-augmented LoRA without storing prior data and dominates EWC on the forgetting-adaptation Pareto front. While LoRA can only reduce forgetting by sacrificing in-domain adaptation, GAIN achieves both with no domain boundaries and no regularization. The principle generalises: (IA)^3, an independent multiplicative method, also improves earlier domains.
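The multiplicative form W_new = S * W and the zero-inference-cost claim can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes that S is a diagonal per-output scaling and that weights are stored as (out_features, in_features); neither is stated in the abstract. All names (`W`, `s`, `B`, `A`) are hypothetical. The rank check only shows that scaling adds no directions to the span of W's rows under this layout, whereas a LoRA-style additive update does. Whether that is the "column span" of Proposition 1 depends on the paper's layout convention.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 8, 3

# Toy rank-deficient "pretrained" weight, layout (out_features, in_features).
W = rng.standard_normal((d_out, r)) @ rng.standard_normal((r, d_in))

# Assumption: S is a diagonal scaling initialised near the identity.
s = 1.0 + 0.1 * rng.standard_normal(d_out)
S = np.diag(s)

x = rng.standard_normal(d_in)

# Applying the scaling at runtime equals using the merged weight S @ W,
# so the modulation can be folded into W before inference.
y_runtime = s * (W @ x)
W_merged = S @ W
y_merged = W_merged @ x
absorbed_ok = np.allclose(y_runtime, y_merged)

# Multiplicative update: stacking W and S @ W does not raise the rank,
# i.e. the rows of S @ W stay inside the span of W's rows.
rank_W = np.linalg.matrix_rank(W)
rank_mult = np.linalg.matrix_rank(np.vstack([W, W_merged]))

# Additive LoRA-style update W + B @ A: the stacked rank grows,
# i.e. the update introduces directions that W did not have.
B = rng.standard_normal((d_out, 2))
A = rng.standard_normal((2, d_in))
W_add = W + B @ A
rank_add = np.linalg.matrix_rank(np.vstack([W, W_add]))

print(absorbed_ok, rank_W, rank_mult, rank_add)
```

Because the merged matrix has the same shape as W, a model adapted this way needs no extra parameters or operations at inference time under these assumptions.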
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | HellaSwag | -- | 1896 |
| Commonsense Reasoning | WinoGrande | -- | 1442 |
| Physical Commonsense Reasoning | PIQA | -- | 696 |
| Sentence Completion | HellaSwag | -- | 364 |
| Language Modeling | PG-19 | -- | 206 |
| Question Answering | ARC-C | -- | 116 |
| Question Answering | OpenBookQA | Normalized Accuracy: 0.4 | 102 |
| Question Answering | ARC-E | Normalized Accuracy (ARC-E): 3.8 | 59 |
| Language Modeling | Medical (Med) | PPL Change (%) vs Baseline: 0.8 | 30 |
| Language Modeling | Finance (Fin) | PPL Change (%): 0.00e+0 | 28 |