
Certified Unlearning for Neural Networks

About

We address the problem of machine unlearning, where the goal is to remove the influence of specific training data from a model upon request, motivated by privacy concerns and regulatory requirements such as the "right to be forgotten." Unfortunately, existing methods rely on restrictive assumptions or lack formal guarantees. To this end, we propose a novel method for certified machine unlearning, leveraging the connection between unlearning and privacy amplification by stochastic post-processing. Our method uses noisy fine-tuning on the retain data, i.e., data that does not need to be removed, to ensure provable unlearning guarantees. This approach requires no assumptions about the underlying loss function, making it broadly applicable across diverse settings. We analyze the theoretical trade-offs in efficiency and accuracy and demonstrate empirically that our method not only achieves formal unlearning guarantees but also performs effectively in practice, outperforming existing baselines. Our code is available at https://github.com/stair-lab/certified-unlearning-neural-networks-icml-2025
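As a rough illustration of the idea described above, the sketch below fine-tunes a model on the retain data with clipped, noise-perturbed gradient steps, in the spirit of privacy amplification by stochastic post-processing. This is a minimal sketch assuming a logistic-regression loss; the function name, hyperparameters (`clip`, `sigma`, `lr`), and loss are illustrative assumptions, not the paper's actual algorithm or the guarantee-calibrated noise levels it derives.

```python
import numpy as np

def noisy_finetune(w, X_retain, y_retain, steps=100, lr=0.1,
                   clip=1.0, sigma=0.5, seed=0):
    """Illustrative noisy fine-tuning on the retain set only.

    Hypothetical sketch: clipped gradient steps plus Gaussian noise.
    The paper's analysis determines the noise scale needed for a
    formal (certified) unlearning guarantee; sigma here is arbitrary.
    """
    rng = np.random.default_rng(seed)
    w = w.copy()
    for _ in range(steps):
        # Logistic-regression gradient computed on the retain data only,
        # so the deleted points never influence the updates.
        p = 1.0 / (1.0 + np.exp(-X_retain @ w))
        g = X_retain.T @ (p - y_retain) / len(y_retain)
        # Clip the gradient norm so a fixed noise scale can dominate
        # any single step's information content.
        g *= min(1.0, clip / (np.linalg.norm(g) + 1e-12))
        # Gaussian noise makes each update a stochastic post-processing
        # step of the previous iterate.
        w -= lr * (g + sigma * rng.normal(size=w.shape))
    return w

# Toy usage on synthetic data (shapes only; not a real experiment).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = (rng.random(50) > 0.5).astype(float)
w_unlearned = noisy_finetune(np.zeros(3), X, y, steps=20)
```

Because the update touches only the retain set and injects calibrated noise, no assumption on the loss beyond what the analysis requires is needed, which is what makes the approach broadly applicable.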

Anastasia Koloskova, Youssef Allouah, Animesh Jha, Rachid Guerraoui, Sanmi Koyejo • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|------|---------|--------|--------|------|
| Full-class forgetting | MNIST (retain) | Accuracy | 84 | 2 |
| Full-class forgetting | Iris (test) | Accuracy | 93.3 | 1 |
| Full-class forgetting | Fashion-MNIST (test) | Accuracy | 84.13 | 1 |
| Full-class forgetting | MNIST (test) | Accuracy | 73.5 | 1 |
| Machine Unlearning | Fashion-MNIST 2% subset | Accuracy | 0.9421 | 1 |
| Subset Forgetting | MNIST (test) | Accuracy | 77.2 | 1 |
| Full-class forgetting | Fashion-MNIST (retain) | Accuracy | 95.24 | 1 |
| Machine Unlearning | Fashion-MNIST 2% (test) | Accuracy | 86.56 | 1 |
| Full-class forgetting | Iris (retain) | Accuracy | 88.8 | 1 |
| Machine Unlearning | Iris subset 2% (retained) | Accuracy | 91.5 | 1 |

Showing 10 of 12 rows
