Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations
About
We describe a procedure for removing the dependency on a cohort of training data from a trained deep network. The procedure improves upon and generalizes previous methods to different readout functions, and can be extended to ensure forgetting in the activations of the network. We introduce a new bound on how much information about the forgotten cohort can be extracted per query from a black-box network, for which only the input-output behavior is observed. The proposed forgetting procedure has a deterministic part, derived from the differential equations of a linearized version of the model, and a stochastic part that ensures information destruction by adding noise tailored to the geometry of the loss landscape. We exploit connections between the activation and weight dynamics of a DNN, inspired by Neural Tangent Kernels, to compute the information in the activations.
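The two-part structure described above can be illustrated with a minimal sketch. This is not the paper's actual procedure; it is a simplified influence-function-style update under strong assumptions (a quadratic loss with an invertible Hessian on the retained data). The function name `scrub_weights` and all parameters are hypothetical. The deterministic part shifts the weights to approximately undo the forgotten cohort's influence; the stochastic part adds Gaussian noise whose covariance follows the inverse Hessian, so more information is destroyed along flat directions of the loss landscape.

```python
import numpy as np

def scrub_weights(w, grad_forget, hessian_retain, noise_scale=0.1, rng=None):
    """Illustrative scrubbing step (hypothetical sketch, not the paper's method).

    Deterministic part: a Newton-style shift that approximately removes the
    forgotten cohort's first-order influence, using the Hessian of the loss
    on the retained data.
    Stochastic part: Gaussian noise with covariance proportional to the
    inverse Hessian, i.e. noise shaped by the local loss-landscape geometry.
    """
    rng = rng or np.random.default_rng(0)
    h_inv = np.linalg.inv(hessian_retain)      # assumes an invertible Hessian
    w_det = w + h_inv @ grad_forget            # deterministic "unlearning" shift
    cov = noise_scale * h_inv                  # noise covariance ~ inverse curvature
    noise = rng.multivariate_normal(np.zeros_like(w), cov)
    return w_det + noise

# Toy usage: 2 weights, isotropic curvature.
w = np.array([1.0, 2.0])
H = 2.0 * np.eye(2)
g = np.array([0.5, -0.5])
w_scrubbed = scrub_weights(w, g, H, noise_scale=0.01)
```

With `noise_scale=0` the update reduces to the purely deterministic shift `w + H^{-1} g`; the noise term is what provides the information-destruction guarantee in the paper's analysis.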
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Machine Unlearning | MNIST | Model Accuracy | 88.88 | 44 |
| Selective Unlearning | Lacuna-10 (test) | Test Error (mean) | 1.87 | 36 |
| Class Unlearning | Lacuna-10 (test) | Test Error (mean) | 1.78 | 27 |
| Resolving Confusion | Lacuna-5 (test) | Test Error | 12.87 | 27 |
| Selective Unlearning | CIFAR-10 | Test Error | 21.23 | 27 |
| Class Unlearning | Lacuna-10 | Test Error | 3.33 | 27 |
| Machine Unlearning | CIFAR-10 | Accuracy | 62.81 | 24 |
| Machine Unlearning | UCI Adult | Accuracy | 84.34 | 24 |
| Class Unlearning | CIFAR-10 (test) | -- | -- | 21 |
| Resolving Confusion | CIFAR-10 5-class (test) | IC Test Error | 36.7 | 20 |