
Unlearnable Examples: Making Personal Data Unexploitable

About

The volume of "free" data on the internet has been key to the current success of deep learning. However, it also raises privacy concerns about the unauthorized exploitation of personal data for training commercial models. It is thus crucial to develop methods that prevent unauthorized data exploitation. This paper raises the question: can data be made unlearnable for deep learning models? We present a type of error-minimizing noise that can indeed make training examples unlearnable. Error-minimizing noise is intentionally generated to reduce the training error of one or more training examples to near zero, which tricks the model into believing there is "nothing" to learn from these examples. The noise is restricted to be imperceptible to human eyes, and thus does not affect normal data utility. We empirically verify the effectiveness of error-minimizing noise in both sample-wise and class-wise forms. We also demonstrate its flexibility across extensive experimental settings and its practicality in a case study on face recognition. Our work establishes an important first step towards making personal data unexploitable by deep learning models.

Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, James Bailey, Yisen Wang • 2021
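
To make the idea concrete, here is a minimal PyTorch sketch of the sample-wise variant described in the abstract: the perturbation is optimized to minimize, rather than maximize, the training loss, under an imperceptibility budget. The function name, hyperparameters, and the simple PGD-style inner loop are illustrative assumptions, not the authors' exact implementation (the paper alternates this noise optimization with training a source model).

```python
import torch
import torch.nn.functional as F

def error_minimizing_noise(model, x, y, eps=8/255, alpha=2/255, steps=20):
    """Craft sample-wise error-minimizing noise for a batch (x, y).

    The perturbation delta is updated to *minimize* the training loss
    (note the minus sign below), the opposite of adversarial noise, so
    the perturbed examples appear "already learned". The L-infinity
    budget eps keeps the noise imperceptible.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Gradient *descent* on the loss with respect to the noise.
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps)
        # Keep perturbed pixels in the valid [0, 1] range.
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
    return delta.detach()
```

Data released as x + delta then yields near-zero loss during training, so a model learns little from its actual content.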

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | – | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy: 89.15 | 3381 |
| Image Classification | ImageNet (test) | – | 235 |
| Image Classification | SUN397 (test) | Top-1 Accuracy: 38.48 | 136 |
| Image Classification | Flowers (test) | Accuracy: 50.58 | 87 |
| Adversarial Robustness | CIFAR-10 (test) | – | 76 |
| Binary Classification | CH (test) | Accuracy: 76.12 | 64 |
| Image Classification | Cars (test) | Accuracy: 54.43 | 57 |
| Image Classification | Food (test) | Accuracy: 79.43 | 50 |
| Classification | dry-bean (test) | Accuracy: 47.36 | 39 |

Showing 10 of 18 rows.
