Certified Adversarial Robustness via Randomized Smoothing
About
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with, for example, a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at http://github.com/locuslab/smoothing.
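The core idea can be sketched in a few lines: the smoothed classifier $g(x)$ returns the class the base classifier $f$ is most likely to output under Gaussian noise, and the certified $\ell_2$ radius is $R = \sigma \,\Phi^{-1}(\underline{p_A})$, where $\underline{p_A}$ is a lower confidence bound on the top-class probability. The sketch below is a simplified illustration, not the authors' exact CERTIFY procedure (which uses a Clopper-Pearson bound and a separate selection sample); the toy base classifier, the Hoeffding bound, and all hyperparameters are illustrative assumptions.

```python
import math
import numpy as np
from statistics import NormalDist

def smoothed_certify(base_classifier, x, sigma=0.5, n=1000, alpha=0.001, seed=0):
    """Predict with the smoothed classifier g(x) = argmax_c P[f(x + eps) = c],
    eps ~ N(0, sigma^2 I), and certify an l2 radius R = sigma * Phi^{-1}(pA_lo)."""
    rng = np.random.default_rng(seed)
    votes = {}
    # Monte Carlo estimate of the smoothed class probabilities.
    for _ in range(n):
        c = base_classifier(x + rng.normal(scale=sigma, size=x.shape))
        votes[c] = votes.get(c, 0) + 1
    top = max(votes, key=votes.get)
    p_a = votes[top] / n
    # Hoeffding lower confidence bound on the top-class probability
    # (a simplification; the paper uses a Clopper-Pearson bound).
    p_a_lo = p_a - math.sqrt(math.log(1 / alpha) / (2 * n))
    if p_a_lo <= 0.5:
        return top, 0.0  # abstain: no radius can be certified
    return top, sigma * NormalDist().inv_cdf(p_a_lo)

# Toy base classifier (hypothetical): the sign of the coordinate sum.
pred, radius = smoothed_certify(lambda z: int(z.sum() > 0), np.ones(8))
```

On the toy input above, essentially every noisy sample keeps a positive sum, so the vote is near-unanimous and a positive radius is certified; in practice, larger $\sigma$ trades certified radius against the base classifier's accuracy under noise.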
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | MNIST | -- | -- | 263 |
| Image Classification | CIFAR-10 corrupted (test) | Acc | 88.3 | 30 |
| Certified Image Classification | MNIST (test) | Certified Accuracy (r=0.00) | 99.25 | 27 |
| Image Classification Certified Robustness | MNIST (test) | Overall ACR | 1.62 | 27 |
| Certified Robustness | CIFAR-10 (test) | Accuracy (Standard) | 92.7 | 26 |
| Image Classification | CIFAR-10.1 1.0 (test) | Accuracy | 76.7 | 14 |
| Certified Robust Classification | CIFAR-10 official (test) | ACR | 0.525 | 14 |
| Certified Accuracy | CIFAR-10 (test) | Certified Accuracy (r=0.0) | 65 | 9 |
| Image Classification | ImageNet sub-sampled 500 samples (val) | ACR | 0.875 | 8 |
| Image Classification | ImageNet 10-class subset (test) | Certified Accuracy (eps=0.00) | 93.4 | 4 |