
Machine Unlearning of Features and Labels

About

Removing information from a machine learning model is a non-trivial task that requires partially reverting the training process. This task is unavoidable when sensitive data, such as credit card numbers or passwords, accidentally enter the model and need to be removed afterwards. Recently, different concepts for machine unlearning have been proposed to address this problem. While these approaches are effective in removing individual data points, they do not scale to scenarios where larger groups of features and labels need to be reverted. In this paper, we propose the first method for unlearning features and labels. Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters. It enables adapting the influence of training data on a learning model retrospectively, thereby correcting data leaks and privacy issues. For learning models with strongly convex loss functions, our method provides certified unlearning with theoretical guarantees. For models with non-convex losses, we empirically show that unlearning features and labels is effective and significantly faster than other strategies.
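The closed-form update described above can be sketched for a simple strongly convex case. The sketch below (an illustration, not the authors' implementation; all function names are ours) trains an L2-regularized logistic regression model and then "unlearns" a set of perturbed feature vectors with a single second-order step: since the old parameters minimize the old loss, replacing training points `X[idx]` by corrected versions `x_new` shifts the gradient by the difference of the per-point gradients, and one Newton step with the loss Hessian absorbs that shift.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def fit(X, y, lam, steps=500, lr=0.1):
    # Plain gradient descent on the L2-regularized logistic loss.
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        g = X.T @ (sigmoid(X @ theta) - y) + lam * theta
        theta -= lr * g / len(y)
    return theta

def unlearn_update(theta, X, y, idx, x_new, lam):
    """One closed-form (Newton) update that replaces the feature
    vectors X[idx] by x_new, approximating full retraining.

    theta ← theta + H^{-1} (∇ℓ(z_old) − ∇ℓ(z_new)),
    where H is the Hessian of the full regularized loss at theta.
    """
    p = sigmoid(X @ theta)
    H = (X * (p * (1 - p))[:, None]).T @ X + lam * np.eye(X.shape[1])
    g_old = X[idx].T @ (sigmoid(X[idx] @ theta) - y[idx])
    g_new = x_new.T @ (sigmoid(x_new @ theta) - y[idx])
    return theta + np.linalg.solve(H, g_old - g_new)
```

For example, zeroing out a leaked feature column in a subset of training points and applying `unlearn_update` moves the parameters toward those of a model retrained from scratch on the corrected data, at the cost of one Hessian solve instead of a full training run. The regularizer `lam > 0` is what makes the loss strongly convex, the setting in which the paper's certified-unlearning guarantees apply.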

Alexander Warnecke, Lukas Pirch, Christian Wressnegger, Konrad Rieck • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Class Unlearning | CIFAR-10 | Retain Accuracy: 89.33 | 60 |
| Machine Unlearning | CIFAR-100 (test) | Retain Acc: 0.9958 | 45 |
| Machine Unlearning | CIFAR-10 | -- | 45 |
| Machine Unlearning | Tiny-ImageNet (train) | Forgetting Accuracy (Train): 69.98 | 43 |
| Machine Unlearning | CIFAR-100 In Class Random Forgetting | RA (Utility Retention): 98.22 | 40 |
| Class Unlearning | CIFAR-10 (test) | Df: 6.89 | 35 |
| Single-class Unlearning | CIFAR-100 | ACCr: 73.37 | 28 |
| Single-class Unlearning | MNIST | Accuracy Retention (ACCr): 0.9943 | 28 |
| Machine Unlearning | CIFAR-10 1.0 (test) | Test Acc: 95.08 | 24 |
| Machine Unlearning | CIFAR-10 30% random data forgetting | -- | 24 |

Showing 10 of 64 rows.
