Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
About
The trade-off between robustness and accuracy has been widely studied in the adversarial literature. Although still controversial, the prevailing view is that this trade-off is inherent, either empirically or theoretically. Thus, we dig for the origin of this trade-off in adversarial training and find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance -- an overcorrection towards smoothness. Given this, we advocate employing local equivariance to describe the ideal behavior of a robust model, leading to a self-consistent robust error named SCORE. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty via robust optimization. By simply substituting KL divergence with variants of distance metrics, SCORE can be efficiently minimized. Empirically, our models achieve top-rank performance on RobustBench under AutoAttack. Besides, SCORE provides instructive insights for explaining the overfitting phenomenon and semantic input gradients observed on robust models. Code is available at https://github.com/P2333/SCORE.
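The abstract's remark about substituting KL divergence with distance metrics can be illustrated with a small sketch. It compares a KL term and a distance term between a model's predicted class probabilities on a clean input and on a perturbed input. The choice of squared Euclidean distance, the helper names (`softmax`, `kl_divergence`, `squared_l2`), and the logit values are all assumptions made for illustration; the abstract does not say which distance metrics SCORE uses, and this is not the authors' implementation (see the linked repository for that).

```python
import math


def softmax(logits):
    """Convert raw logits into a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions; eps avoids log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def squared_l2(p, q):
    """Squared Euclidean distance between two probability vectors.

    Assumption: one possible 'distance metric' stand-in for the KL term.
    """
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


# Hypothetical logits for one clean input and one perturbed copy of it.
clean_logits = [2.0, 0.5, -1.0]
perturbed_logits = [1.2, 1.0, -0.8]

p_clean = softmax(clean_logits)
p_adv = softmax(perturbed_logits)

kl_term = kl_divergence(p_clean, p_adv)
dist_term = squared_l2(p_clean, p_adv)

print(f"KL term:       {kl_term:.4f}")
print(f"Distance term: {dist_term:.4f}")
```

Unlike KL divergence, the squared Euclidean distance is symmetric in its two arguments and is bounded (by 2 for probability vectors). Either term could serve as the consistency penalty inside a robust training objective.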
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 (test) | Accuracy (Clean) | 82.95 | 273 |
| Image Classification | CIFAR10 (train) | Accuracy | 92.92 | 90 |
| Image Classification | CIFAR-100 (test) | Clean Accuracy | 65.56 | 61 |
| Image Classification | CIFAR-100 1x10^6 EDM-generated images-augmented (train) | Cleantr | 85.22 | 18 |
| Image Classification | CIFAR-10 DM-AT (train) | Clean Accuracy | 94.76 | 18 |
| Image Classification | CIFAR-100 1x10^6 EDM-generated images-augmented (test) | Cleante Accuracy | 63.12 | 18 |
| Image Classification | CIFAR-10 DM-AT (test) | Clean Accuracy | 88.66 | 18 |
| Image Classification | SVHN (test) | Accuracy (Clean) | 97.75 | 17 |
| Adversarial Robustness | CIFAR-100 | Final Auto-Attack Accuracy | 31.4 | 16 |
| Image Classification | SVHN (train) | Clean Accuracy | 97.43 | 15 |