Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate

About

Machine unlearning has been used to remove unwanted knowledge acquired by large language models (LLMs). In this paper, we examine machine unlearning from an optimization perspective, framing it as a regularized multi-task optimization problem, where one task optimizes a forgetting objective and another optimizes the model performance. In particular, we introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives, while integrating a new, automatic learning rate scheduler. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets while exhibiting stable training.
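The abstract does not spell out the update rule, so the following is only a minimal sketch of what a normalized gradient difference step could look like. It assumes the reading suggested by the name: the retain-task gradient and the forget-task gradient are each scaled to unit L2 norm before being subtracted. The function names, the `eps` guard, and the fixed learning rate `lr` are illustrative assumptions. The paper's automatic learning rate scheduler is not reproduced here.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def normalized_gradient_difference(retain_grad, forget_grad, eps=1e-12):
    """Assumed form of the direction: g_R / ||g_R|| - g_F / ||g_F||.

    retain_grad: gradient of the retain (model performance) loss.
    forget_grad: gradient of the forgetting loss.
    eps is a hypothetical guard against division by zero.
    """
    r_norm = l2_norm(retain_grad) + eps
    f_norm = l2_norm(forget_grad) + eps
    return [r / r_norm - f / f_norm for r, f in zip(retain_grad, forget_grad)]


def ngdiff_step(params, retain_grad, forget_grad, lr):
    """One descent step along the normalized difference.

    Moving against this direction lowers the retain loss and raises the
    forget loss to first order. lr is a fixed step size in this sketch.
    """
    direction = normalized_gradient_difference(retain_grad, forget_grad)
    return [p - lr * d for p, d in zip(params, direction)]
```

Because each gradient is normalized separately, rescaling either loss leaves the direction unchanged. Under this reading, neither objective can dominate the update simply by having a larger gradient magnitude, which is one way to control the trade-off the abstract describes.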

Zhiqi Bu, Xiaomeng Jin, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Mingyi Hong • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| General Language Understanding | MMLU | MMLU Score | 60.8 | 39 |
| Adversarial Robustness | MUSE-Book Harry Potter | ASR | 17.9 | 11 |
| Machine Unlearning | MUSE-Book Harry Potter | Forget | 13.6 | 11 |
| Neighbor Domain Knowledge Preservation | MUSE-Book Harry Potter | Retain Accuracy | 37.7 | 11 |
| LLM Unlearning | WMDP cyber | Forget Rate | 17.3 | 8 |
