Unlocking High-Accuracy Differentially Private Image Classification through Scale

About

Differential Privacy (DP) provides a formal privacy guarantee that prevents adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method for deep learning, realizes this protection by injecting noise during training. However, previous works have found that DP-SGD often leads to a significant degradation in performance on standard image classification benchmarks. Furthermore, some authors have postulated that DP-SGD inherently performs poorly on large models, since the norm of the noise required to preserve privacy is proportional to the model dimension. In contrast, we demonstrate that DP-SGD on over-parameterized models can perform significantly better than previously thought. Combining careful hyper-parameter tuning with simple techniques to ensure signal propagation and improve the convergence rate, we obtain a new SOTA without extra data on CIFAR-10 of 81.4% under (8, 10^{-5})-DP using a 40-layer Wide-ResNet, improving over the previous SOTA of 71.7%. When fine-tuning a pre-trained NFNet-F3, we achieve a remarkable 83.8% top-1 accuracy on ImageNet under (0.5, 8 \cdot 10^{-7})-DP. We also achieve 86.7% top-1 accuracy under (8, 8 \cdot 10^{-7})-DP, which is just 4.3% below the current non-private SOTA for this task. We believe our results are a significant step towards closing the accuracy gap between private and non-private image classification.

Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle • 2022
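
The abstract describes DP-SGD as injecting noise during training. As a rough illustration of that mechanism, the sketch below shows one generic DP-SGD update in NumPy: each example's gradient is clipped to a fixed L2 norm, Gaussian noise scaled to that norm is added to the sum, and the noisy average drives an ordinary gradient step. This is a textbook-style sketch under stated assumptions, not the authors' implementation; the function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for the example, and the privacy accounting that turns a noise level into an (epsilon, delta) guarantee is not shown.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One generic DP-SGD update on a flat parameter vector (illustrative only).

    params:            array of shape (dim,)
    per_example_grads: array of shape (batch_size, dim), one gradient per example
    clip_norm:         maximum L2 norm allowed for each per-example gradient
    noise_multiplier:  noise standard deviation expressed in units of clip_norm
    rng:               a numpy.random.Generator
    """
    batch_size = per_example_grads.shape[0]

    # Clip each example's gradient so that its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Add isotropic Gaussian noise to the summed clipped gradients.
    # One noise value is drawn per parameter.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size

    # Ordinary gradient descent step using the privatized gradient.
    return params - lr * noisy_mean
```

Calling `dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=np.random.default_rng(0))` returns the updated parameters. Because noise is drawn independently for every parameter, the total noise added per step grows with model size, which is the concern about large models that the abstract refers to.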

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet 1k (test)	Top-1 Accuracy: 86.7	939
Image Classification	CIFAR10 (test)	Accuracy: 88.9	585
Image Classification	ImageNet (test)	Top-1 Accuracy: 32.4	299
Image Classification	ImageNet (val)	Top-1 Accuracy: 32.4	188
Image Classification	Places-365 (val)	--	54
Image Classification	CIFAR-10 resized to 224 × 224 (test)	Accuracy: 96.6	15
Image Classification	CIFAR-100 resized to 224 × 224 (test)	Accuracy: 81.8	12
Image Classification	CIFAR-10 down-sampled to 32x32 (test)	Median Accuracy: 96.6	11
Image Classification	CIFAR-10 (test)	Median Test Accuracy: 89	10
Image Classification	CIFAR-100 down-sampled to 32x32 (test)	Median Accuracy: 81.8	8

