Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement

About

Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks. Approximate MU is a practical method for large-scale models. Our investigation into approximate MU starts with identifying the steepest descent direction, minimizing the output Kullback-Leibler divergence to exact MU inside a parameters' neighborhood. This probed direction decomposes into three components: weighted forgetting gradient ascent, fine-tuning retaining gradient descent, and a weight saliency matrix. Such decomposition derived from Euclidean metric encompasses most existing gradient-based MU methods. Nevertheless, adhering to Euclidean space may result in sub-optimal iterative trajectories due to the overlooked geometric structure of the output probability space. We suggest embedding the unlearning update into a manifold rendered by the remaining geometry, incorporating second-order Hessian from the remaining data. It helps prevent effective unlearning from interfering with the retained performance. However, computing the second-order Hessian for large-scale models is intractable. To efficiently leverage the benefits of Hessian modulation, we propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction. Free from specific modal constraints, our approach is adaptable across computer vision unlearning tasks, including classification and generation. Extensive experiments validate our efficacy and efficiency. Notably, our method successfully performs class-forgetting on ImageNet using DiT and forgets a class on CIFAR-10 using DDPM in just 50 steps, compared to thousands of steps required by previous methods.

Zhehao Huang, Xinwen Cheng, JingHao Zheng, Haoran Wang, Zhengbao He, Tao Li, Xiaolin Huang• 2024

Related benchmarks

Task	Dataset	Result
General Knowledge Evaluation	MMLU	MMLU Accuracy61.7	127
Utility Evaluation	MMLU	MMLU Score47.2	45
Machine Unlearning	CIFAR-10 Random Forget 10% (train)	Retain Accuracy92.86	37
Knowledge Retention	WMDP retain	Retain47.1	36
Machine Unlearning	WMDP-bio 1.0 (test)	BF16 Accuracy72.4	28
Machine Unlearning	WMDP-cyber 1.0 (test)	BF16 Score49.1	28
Machine Unlearning	WMDP-chem 1.0 (test)	BF160.514	28
Object Classification Unlearning	CIFAR-10 (10% random data forgetting)	UA4.8	25
Machine Unlearning	Tiny ImageNet (test)	Residual Accuracy97.94	23
Unlearning	WMDP (forget split)	BF16 Precision55.4	22

Showing 10 of 53 rows

Other info

Follow for update

@wizwand_team Discord