# MUNBa: Machine Unlearning via Nash Bargaining

## About
Machine Unlearning (MU) aims to selectively erase harmful behaviors from a model while retaining its overall utility. As a multi-task learning problem, MU involves balancing two objectives: forgetting specific concepts or data, and preserving general performance. A naive combination of these forgetting and preserving objectives can lead to gradient conflict and gradient dominance, preventing MU algorithms from reaching optimal solutions. To address this, we reformulate MU as a two-player cooperative game in which the two players, a forgetting player and a preservation player, contribute gradient proposals that maximize their overall gain while balancing their contributions. Inspired by Nash bargaining theory, we derive a closed-form solution that guides the model toward a Pareto stationary point. Our formulation guarantees an equilibrium solution, where any deviation from the final state would reduce the overall objective for both players, ensuring optimality in each objective. We evaluate our algorithm on a diverse set of tasks across image classification and image generation. Extensive experiments with ResNet, the vision-language model CLIP, and text-to-image diffusion models demonstrate that our method outperforms state-of-the-art MU algorithms, achieving a better trade-off between forgetting and preserving. Our results also show improvements in forgetting precision, preservation of generalization, and robustness against adversarial attacks.
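The two-player gradient bargain described above can be sketched as follows. This is an illustrative toy implementation, not the authors' exact algorithm: it assumes the common Nash-bargaining stationarity condition $(G^\top G)\,\alpha = 1/\alpha$ for the per-player weights $\alpha$, with the combined update taken as the $\alpha$-weighted sum of the two gradient proposals. The function name and solver choice are assumptions.

```python
# Hedged sketch of a two-player Nash-bargaining gradient combination.
# Illustrative only; weights alpha solve (G G^T) alpha = 1/alpha, a
# standard Nash-bargaining stationarity condition for two cooperating tasks.
import numpy as np
from scipy.optimize import fsolve


def nash_bargain_update(g_forget, g_preserve):
    """Combine forgetting/preservation gradient proposals into one update.

    Returns the positive bargaining weights alpha and the combined
    update direction alpha[0] * g_forget + alpha[1] * g_preserve.
    """
    G = np.stack([g_forget, g_preserve])   # shape (2, d): one row per player
    M = G @ G.T                            # 2x2 Gram matrix of the proposals
    # Solve M @ alpha = 1 / alpha numerically from a neutral start.
    alpha = fsolve(lambda a: M @ a - 1.0 / a, np.ones(2))
    return alpha, alpha @ G                # weights, combined direction


# Toy example: mildly conflicting forgetting/preservation gradients.
g_f = np.array([1.0, 0.0])
g_p = np.array([0.5, 1.0])
alpha, d = nash_bargain_update(g_f, g_p)
```

Because the Gram matrix is positive definite here, a positive solution for the weights exists, and neither player's gradient can dominate the combined direction.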
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Machine Unlearning | Tiny-ImageNet (train) | -- | -- | 43 |
| Full-class unlearning | Tiny-ImageNet | Retention Accuracy (RA) | 64.22 | 21 |
| Machine Unlearning | CIFAR-100 (train) | Accuracy ($D_f$) | 33.8 | 19 |
| Random subset unlearning | SVHN | Retention Accuracy (RA) | 99.73 | 15 |
| Random subset unlearning | CIFAR-10 | Retention Accuracy (RA) | 100 | 15 |
| Sub-class Machine Unlearning | CIFAR-20 Rocket sub-class | Retention Accuracy (RA) | 81.43 | 15 |
| Full-class unlearning | CIFAR-100 | Retention Accuracy (RA) | 74.09 | 15 |
| Sub-class Machine Unlearning | CIFAR-20 Sea sub-class | Retention Accuracy (RA) | 80.64 | 15 |
| Class-wise Unlearning | CIFAR-10 Unlearn 1 Class v1 (10% unlearned) | PGH | 16.92 | 13 |
| Machine Unlearning | ImageNet 1-class unlearning 1K | PGH | 34.05 | 13 |