Little by Little: Continual Learning via Incremental Mixture of Rank-1 Associative Memory Experts

About

Continual learning (CL) with large pre-trained models aims to incrementally acquire knowledge without catastrophic forgetting. Existing LoRA-based Mixture-of-Experts (MoE) methods expand capacity by adding isolated new experts while freezing old ones, but still suffer from redundancy, interference, routing ambiguity, and consequent forgetting. We trace these issues to coarse-grained expert granularity: coarse-grained experts (e.g., high-rank LoRA) encode weakly specialized information, which leads to expert duplication and interference, and to routing degradation and confusion as experts accumulate. In this work, we propose MoRAM (Mixture of Rank-1 Associative Memory). Grounded in the view that weight matrices act as linear associative memories, MoRAM performs CL by incrementally expanding a memory of reusable, atomic rank-1 experts. Each rank-1 adapter acts as a fine-grained MoE expert, or equivalently as an associative memory unit. By viewing rank-1 experts as key-value memory pairs, we replace explicit MoE-LoRA routers with self-activation, where each memory atom evaluates its own relevance via its intrinsic key. Inference thus becomes content-addressable retrieval and recall over the incrementally accumulated memory of learning snapshots. Extensive experiments on CLIP and LLMs show that MoRAM significantly outperforms state-of-the-art methods, achieving a better plasticity-stability trade-off, stronger generalization, and reduced forgetting. Project Page: https://artificer-ai-lab.github.io/MoRAM/.
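The key-value reading of rank-1 experts can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `moram_forward` and `add_atoms`, the raw dot-product score, and the optional top-k sparsification are assumptions made for illustration only. The sketch shows that each atom (key k_i, value v_i) contributes the rank-1 update v_i k_i^T, that an atom's activation comes from its own key rather than from a separate router, and that learning a new task appends atoms without touching old ones.

```python
import numpy as np

def moram_forward(x, W0, keys, values, top_k=None):
    # x: (d_in,) input; W0: (d_out, d_in) frozen pre-trained weight.
    # keys: (n, d_in) and values: (n, d_out), one row per rank-1 atom.
    scores = keys @ x                      # each atom rates its own relevance
    gate = scores.copy()
    if top_k is not None and top_k < len(scores):
        # ASSUMPTION: sparse recall keeps only the top_k highest-scoring atoms.
        drop = np.argsort(scores)[:len(scores) - top_k]
        gate[drop] = 0.0
    return W0 @ x + values.T @ gate        # recall: sum_i gate_i * v_i

def add_atoms(keys, values, new_keys, new_values):
    # Incremental expansion: old atoms are left unchanged, new ones appended.
    return np.vstack([keys, new_keys]), np.vstack([values, new_values])

# Toy usage with hypothetical sizes: d_in = 3, d_out = 2, three atoms.
W0 = np.zeros((2, 3))
keys = np.eye(3)
values = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x = np.array([0.0, 2.0, 0.0])
y_sparse = moram_forward(x, W0, keys, values, top_k=1)
y_dense = moram_forward(x, W0, keys, values)
```

With no sparsification, the output equals `(W0 + values.T @ keys) @ x`, which is the linear associative memory view: the accumulated atoms behave like a single additive weight update built from rank-1 terms.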

Haodong Lu, Chongyang Zhao, Minhui Xue, Lina Yao, Kristen Moore, Dong Gong • 2025

Related benchmarks

Task	Dataset	Metric	Result
Continual Learning	TRACE	BWT (%)	3.12	124
Continual Learning	Standard CL Benchmark	Avg Final Acc	0.776	71
Continual Learning	Standard CL benchmark (Yelp, Amazon, DBpedia, Yahoo, AG News) latest (test)	Accuracy (CL Suite Test)	79.3	57
Continual Learning	Large Number of Tasks	Average Performance	69.7	50
Multi-domain Task-Incremental Learning	MTIL Order I 5-shot (test)	Accuracy (Caltech101)	95.4	46
Continual Learning	Continual Learning Benchmark 15-Task	Average Accuracy	68.32	28
Continual Learning	X-TAIL	Average Score	80.9	27
Continual Learning	SuperNI	AP	51.79	13
Continual Learning	15-task Sequence Order-6	Average Accuracy	71.95	12
Image Classification	X-TAIL Average	Aircraft Accuracy	81.6	12

Showing 10 of 14 rows
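For readers unfamiliar with the metrics named in the table, the sketch below computes average final accuracy and backward transfer (BWT) from an accuracy matrix, using the definitions commonly found in the continual learning literature. This is an illustration only: the individual benchmarks listed above may define or scale their metrics differently, and the matrix `R` and the function names are hypothetical.

```python
import numpy as np

def average_final_accuracy(R):
    # R[t, i]: accuracy on task i after training on tasks 0..t.
    R = np.asarray(R, dtype=float)
    return R[-1].mean()                    # mean over all tasks after the last one

def backward_transfer(R):
    # Mean change on each earlier task between the moment it was learned
    # (diagonal) and the end of training (last row). Negative means forgetting.
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    return np.mean(R[-1, :T - 1] - np.diag(R)[:T - 1])

# Made-up accuracies for a 3-task sequence (not results from any paper).
R = [[0.90, 0.00, 0.00],
     [0.80, 0.70, 0.00],
     [0.85, 0.60, 0.50]]
final_acc = average_final_accuracy(R)
bwt = backward_transfer(R)
```

Under this convention, a positive BWT means that accuracy on earlier tasks improved, on average, after later tasks were learned.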
