
BLUR: A Bi-Level Optimization Approach for LLM Unlearning

About

Enabling large language models (LLMs) to unlearn knowledge and capabilities acquired during training has proven vital for ensuring compliance with data regulations and promoting ethical practices in generative AI. Although there is growing interest in developing various unlearning algorithms, it remains unclear how best to formulate the unlearning problem. The most popular formulation uses a weighted sum of the forget and retain losses, but it often leads to performance degradation due to the inherent trade-off between the two. In this work, we argue that it is important to model the hierarchical structure of the unlearning problem, where the forget problem (which unlearns certain knowledge and/or capabilities) takes priority over the retain problem (which preserves model utility). This hierarchical structure naturally leads to a bi-level optimization formulation in which the lower-level objective focuses on minimizing the forget loss, while the upper-level objective aims to maintain the model's utility. Based on this new formulation, we propose a novel algorithm, termed Bi-Level UnleaRning (BLUR), which not only possesses strong theoretical guarantees but, more importantly, delivers superior performance. In particular, our extensive experiments demonstrate that BLUR consistently outperforms all state-of-the-art algorithms across various unlearning tasks, models, and metrics. Code is available at https://github.com/OptimAI-Lab/BLURLLMUnlearning.
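The hierarchical structure described in the abstract can be written as a bi-level program. A sketch using illustrative symbols (the loss names below are placeholders for exposition, not notation taken from the paper): the upper level optimizes retained utility only over the set of forget-loss minimizers, rather than trading the two losses off in a weighted sum.

```latex
\min_{\theta \in \mathcal{S}} \; \ell_{\mathrm{retain}}(\theta)
\qquad \text{s.t.} \qquad
\mathcal{S} \;=\; \operatorname*{arg\,min}_{\theta} \; \ell_{\mathrm{forget}}(\theta)
```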

Hadi Reisizadeh, Jinghan Jia, Zhiqi Bu, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Sijia Liu, Mingyi Hong• 2025
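To see why the hierarchical (bi-level) view differs from a weighted sum, consider a toy 1-D example with a penalty-style surrogate. This is a minimal illustration with made-up quadratic losses and a plain gradient-descent loop, not BLUR's actual algorithm: as the penalty weight on the forget loss grows, the solution approaches the lower-level (forget-optimal) minimizer, whereas a small weight yields a compromise between the two objectives.

```python
# Toy illustration of prioritizing the forget loss via a penalty
# surrogate. Both losses are hypothetical, chosen for clarity.

def forget_loss_grad(theta):
    # gradient of (theta - 2)^2, minimized at theta = 2
    return 2.0 * (theta - 2.0)

def retain_loss_grad(theta):
    # gradient of theta^2, minimized at theta = 0
    return 2.0 * theta

def penalty_descent(lam, lr=0.005, steps=5000):
    """Gradient descent on retain + lam * forget.

    Large lam approximates the bi-level solution, where the
    forget loss takes priority over the retain loss.
    """
    theta = 0.0
    for _ in range(steps):
        grad = retain_loss_grad(theta) + lam * forget_loss_grad(theta)
        theta -= lr * grad
    return theta

# Weighted sum with equal weights: a compromise, theta near 1.0.
print(round(penalty_descent(lam=1.0), 3))
# Heavy penalty on the forget loss: theta approaches the
# forget-optimal point 2 (closed form: 2*lam / (1 + lam)).
print(round(penalty_descent(lam=100.0), 3))
```

The closed-form minimizer of the surrogate is 2λ/(1+λ), so the compromise point moves continuously toward the forget-optimal solution as λ grows; the bi-level formulation makes that priority explicit instead of leaving it to a hand-tuned weight.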

Related benchmarks

Task | Dataset | Metric | Result | Rank
Instruction Following | IFEval | IFEval Accuracy | 31.3 | 625
Multi-task Language Understanding | MMLU | Accuracy | 57.1 | 321
Question Answering | TruthfulQA | Accuracy | 39.2 | 152
Natural Language Inference | MNLI | -- | -- | 80
Natural Language Inference | QNLI | Accuracy | 68 | 61
Safety Alignment | WildJailbreak | Safe@1 | 54.4 | 24
Language Modeling | MMLU | MMLU Final Performance | 46 | 23
Question Answering | TruthfulQA | TruthfulQA | 27.1 | 22
Safety Alignment | StrongREJECT | -- | -- | 18
Text-to-Audio | COSE (test) | Accuracy | 53.6 | 6

(Showing 10 of 28 rows.)
