Merge before Forget: A Single LoRA Continual Learning via Continual Merging

About

Parameter-efficient continual learning has emerged as a promising approach for large language models (LLMs) to mitigate catastrophic forgetting while enabling adaptation to new tasks. Current Low-Rank Adaptation (LoRA) continual learning techniques often retain and freeze previously learned LoRAs or generate data representations to overcome forgetting, typically using these to help new LoRAs learn new tasks. However, these methods not only ignore the memory cost that grows with the number of tasks and the limits of storage space, but also suffer from potential task interference due to the lack of effective LoRA merging mechanisms. In this paper, we propose a novel continual learning method that orthogonally initializes and sequentially merges LoRA updates into a single unified LoRA. Our method leverages orthogonal basis extraction from the previously learned LoRA to initialize the learning of new tasks, and further exploits the intrinsic asymmetry of LoRA components by using a time-aware scaling mechanism to balance new and old knowledge during continual merging. Our approach maintains constant memory complexity with respect to the number of tasks, minimizes interference between past and new tasks via orthogonal basis initialization, and improves performance over asymmetric LoRA merging via adaptive scaling. We provide theoretical analysis to justify our design and conduct extensive experiments across diverse continual learning benchmarks using various Llama models, demonstrating the effectiveness and efficiency of our method.

Fuli Qiao, Mehrdad Mahdavi • 2025
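
The abstract names two ingredients: initializing each new task in directions orthogonal to the previously learned LoRA, and merging with a time-aware scale. The NumPy sketch below illustrates both ideas only in spirit. The abstract does not give the paper's formulas, so everything here is an assumption: the function names `orthogonal_init` and `merge_time_aware` are hypothetical, the QR-based projection is one common way to obtain an orthogonal basis, and the 1/t running-average weight is a stand-in for whatever scaling rule the authors actually use.

```python
import numpy as np


def orthogonal_init(a_prev, rank, rng):
    """Return a (rank, d_in) matrix whose rows are orthonormal and
    orthogonal to every row of a_prev, which has shape (r_prev, d_in)."""
    r_prev, d_in = a_prev.shape
    if rank + r_prev > d_in:
        raise ValueError("not enough dimensions left for an orthogonal init")
    # Orthonormal basis of the subspace spanned by the rows of a_prev.
    q_prev, _ = np.linalg.qr(a_prev.T)        # shape (d_in, r_prev)
    # Random directions with the previous subspace projected out.
    g = rng.standard_normal((d_in, rank))
    g -= q_prev @ (q_prev.T @ g)
    # Orthonormalize what remains.
    q_new, _ = np.linalg.qr(g)                # shape (d_in, rank)
    return q_new.T


def merge_time_aware(merged, new, t):
    """Blend the factor learned on task t (1-indexed) into the merged factor.

    Assumed rule: a running average with weight 1/t, so that later tasks
    shift the single merged LoRA less and less. Not the paper's rule.
    """
    lam = 1.0 / t
    return (1.0 - lam) * merged + lam * new


rng = np.random.default_rng(0)
a_prev = rng.standard_normal((4, 32))   # stand-in for a previous rank-4 LoRA factor
a_init = orthogonal_init(a_prev, rank=4, rng=rng)
# Overlap between the new initialization and the old subspace; should be near zero.
print(np.abs(a_init @ a_prev.T).max())
```

Because the merged factor keeps a fixed shape no matter how many tasks are folded in, a scheme of this kind stores one LoRA rather than one per task, which is the constant-memory property the abstract emphasizes.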

Related benchmarks

Task                Dataset                               Metric                   Result  Rank
Continual Learning  Standard CL Benchmark                 Avg Final Acc            0.804   50
Continual Learning  Large Number of Tasks                 Average Performance      74.8    50
Continual Learning  SuperNI Benchmark                     Average Score            37.2    14
Continual Learning  Large Number of Tasks (test)          Backward Transfer (BWT)  -3.5    13
Continual Learning  SuperNI Standard CL Benchmark (test)  Average Performance      81      13
Continual Learning  SuperNI Large Number of Tasks (test)  Average Performance      76.2    13
Continual Learning  SuperNI                               O1 Performance           41      6
Continual Learning  Large Number of Tasks                 MOPD                     15.17   4
Continual Learning  Standard CL Benchmark                 MOPD                     1.72    4