Certified Data Removal from Machine Learning Models

About

Good data stewardship requires removal of data at the request of the data's owner. This raises the question of whether and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request. Is it possible to "remove" data from a machine-learning model? We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with. We develop a certified-removal mechanism for linear classifiers and empirically study learning settings in which this mechanism is practical.
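
To make the idea of removing a point from a linear model concrete, here is a minimal sketch. The abstract does not describe the mechanism, so everything below is an assumption for illustration: a Newton-style update applied to an L2-regularized least-squares model, with hypothetical helper names (`solve`, `fit`, `remove_last`) and synthetic data. For this quadratic loss a single Newton step lands exactly on the retrained solution, which makes the result easy to check. The sketch does not add the randomization that a certified guarantee would need, so it shows only the update, not the certificate.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented copy, inputs untouched
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def hessian(X, lam):
    """Hessian of sum_i 0.5*(w.x_i - y_i)^2 + 0.5*lam*len(X)*||w||^2."""
    d = len(X[0])
    H = [[sum(x[i] * x[j] for x in X) for j in range(d)] for i in range(d)]
    for i in range(d):
        H[i][i] += lam * len(X)
    return H

def fit(X, y, lam):
    """Train from scratch: solve the regularized normal equations."""
    d = len(X[0])
    rhs = [sum(x[i] * t for x, t in zip(X, y)) for i in range(d)]
    return solve(hessian(X, lam), rhs)

def remove_last(w, X, y, lam):
    """Assumed Newton-style removal of the last training point from w."""
    x_n, y_n = X[-1], y[-1]
    resid = dot(w, x_n) - y_n
    # Gradient contribution of the removed point (loss term plus its share
    # of the regularizer), evaluated at the current optimum w.
    delta = [lam * wi + resid * xi for wi, xi in zip(w, x_n)]
    step = solve(hessian(X[:-1], lam), delta)
    return [wi + si for wi, si in zip(w, step)]

# Synthetic data, purely illustrative.
random.seed(0)
true_w = [1.0, -2.0, 0.5]
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(50)]
y = [dot(true_w, x) + 0.1 * random.gauss(0, 1) for x in X]
lam = 0.1

w_full = fit(X, y, lam)                       # trained on all 50 points
w_removed = remove_last(w_full, X, y, lam)    # one update, no retraining
w_retrained = fit(X[:-1], y[:-1], lam)        # reference: retrain on 49 points
max_gap = max(abs(a - b) for a, b in zip(w_removed, w_retrained))
print(max_gap)
```

The printed gap between the updated and the retrained weights should sit at floating-point noise level. For non-quadratic losses such as logistic regression, a single Newton step would only approximate retraining, and the leftover error is what a certification argument would have to bound.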

Chuan Guo, Tom Goldstein, Awni Hannun, Laurens van der Maaten • 2019

Related benchmarks

Task	Dataset	Result
Membership Inference Attack	NYU V2	AUC 96.51	90
Semantic segmentation	NYU v2 (val)	mIoU 74.12	82
Depth Estimation	NYU v2 (val)	--	72
Machine Unlearning	MNIST	Model Accuracy 99.08	66
Semantic segmentation	NYU v2 (Retained set)	mIoU 92.13	37
Multi-task Unlearning Interference	NYU V2	UIS 30.4	34
Depth Estimation	NYU v2 (Retained set)	Acc (sigma 1.25) 83.32	33
Surface Normal Estimation	NYU v2 (Retained set)	A30 58.47	33
Surface Normal Prediction	NYU Forget set v2 (train)	A30 Error 0.4593	30
Surface Normal Prediction	NYU v2 (val)	A30 Accuracy 51.38	30

Showing 10 of 17 rows
