Consistency Regularization for Certified Robustness of Smoothed Classifiers
About
A recent technique of randomized smoothing has shown that the worst-case (adversarial) $\ell_2$-robustness can be transformed into average-case robustness under Gaussian noise by "smoothing" a classifier, i.e., by considering the averaged prediction over Gaussian noise. In this paradigm, one should rethink the notion of adversarial robustness in terms of the generalization ability of a classifier under noisy observations. We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled simply by regularizing the prediction consistency over noise. This relationship allows us to design a robust training objective without approximating a non-existing smoothed classifier, e.g., via soft smoothing. Our experiments under various deep neural network architectures and datasets show that the "certified" $\ell_2$-robustness can be dramatically improved with the proposed regularization, achieving results better than or comparable to state-of-the-art approaches at significantly lower training cost and with fewer hyperparameters.
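The core idea above — penalizing disagreement among predictions on Gaussian-perturbed copies of the same input — can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the repository's implementation: it computes one common form of a consistency regularizer, the mean KL divergence between the averaged prediction over noisy copies and each individual prediction.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(logits_per_noise, eps=1e-12):
    """Consistency regularizer over m noisy copies of one input.

    logits_per_noise: array of shape (m, num_classes), the classifier's
    logits for m Gaussian-perturbed copies of the same image.
    Returns the mean KL divergence KL(p_hat || p_i) between the averaged
    prediction p_hat and each per-noise prediction p_i; it is zero iff
    all copies agree, and grows as predictions diverge.
    """
    p = softmax(logits_per_noise)          # (m, C) per-noise predictions
    p_hat = p.mean(axis=0, keepdims=True)  # (1, C) averaged prediction
    kl = (p_hat * (np.log(p_hat + eps) - np.log(p + eps))).sum(axis=-1)
    return kl.mean()
```

During training, a term like this would be added (with a weight hyperparameter) to the usual cross-entropy loss on the noisy samples, pushing the base classifier's predictions to be stable under the smoothing noise.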
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | MNIST | -- | 263 |
| Image Classification | CIFAR-10 corrupted (test) | Acc: 84.7 | 30 |
| Certified Image Classification | MNIST (test) | Certified Accuracy (r=0.00): 99.43 | 27 |
| Image Classification Certified Robustness | MNIST (test) | Overall ACR: 1.74 | 27 |
| Certified Robustness | CIFAR-10 (test) | Accuracy (Standard): 89.5 | 26 |
| Certified Robust Classification | CIFAR-10 official (test) | ACR: 0.72 | 14 |
| Image Classification | CIFAR-10.1 1.0 (test) | Accuracy: 67.6 | 14 |
| Image Classification | ImageNet sub-sampled 500 samples (val) | ACR: 0.982 | 8 |