MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning

About

Adapting large language models (LLMs) to new domains/tasks and enabling them to be efficient lifelong learners is a pivotal challenge. In this paper, we propose MoRAL, i.e., Mixture-of-Experts augmented Low-Rank Adaptation for Lifelong Learning. MoRAL combines the multi-tasking abilities of MoE with the fine-tuning abilities of LoRA for effective life-long learning of LLMs. In contrast to the conventional approaches that use factual triplets as inputs MoRAL relies on simple question-answer pairs, which is a more practical and effective strategy for robust and efficient learning. Owing to new data settings, we introduce a new evaluation benchmark namely: Life Long Learning of LLM (5L-bench) encompassing a newly curated dataset of question-answer pairs, and a set of evaluation metrics for rigorous evaluation of MoRAL in open-book and closed-book settings. Experimental evaluation shows (i) LLMs learn fast in open-book settings with up to 30.15% improvement in "RA" for Phi-2-2.7B compared to closed-book (for models fine-tuned with MoRAL); (ii) MoRAL shows higher performance improvement for models with a greater number of parameters; (iii) MoRAL is robust to catastrophic forgetting offering better knowledge retention compared to baselines.

Shu Yang, Muhammad Asif Ali, Cheng-Long Wang, Lijie Hu, Di Wang• 2024

Related benchmarks

Task	Dataset	Result
Commonsense Reasoning	HellaSwag	HellaSwag Accuracy77.57	711
Question Answering	ARC	Accuracy52.13	230
Bias Evaluation	BBQ	Accuracy58.67	171
Question Answering	TruthfulQA	Accuracy55.09	164
Question Answering	CommonsenseQA	Accuracy81.57	150
Question Answering	ScienceQA	Accuracy93.79	96
Toxicity Detection	Toxigen	Score54.74	95
Truthful QA	Truthful QA	Accuracy57.5	83
Language Understanding	MMLU	Accuracy51.1	43
Question Answering	OpenBookQA	Accuracy85.8	23

Showing 10 of 13 rows

Other info

Follow for update

@wizwand_team Discord