
Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations

About

Model robustness against adversarial examples of a single perturbation type, such as the $\ell_{p}$-norm, has been widely studied, yet its generalization to more realistic scenarios involving multiple semantic perturbations and their composition remains largely unexplored. In this paper, we first propose a novel method for generating composite adversarial examples. Our method finds the optimal attack composition by utilizing component-wise projected gradient descent and automatic attack-order scheduling. We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_{p}$-ball to composite semantic perturbations, such as combinations of hue, saturation, brightness, contrast, and rotation. Results on the ImageNet and CIFAR-10 datasets indicate that GAT is robust not only to every tested single-attack type but also to any combination of such attacks. GAT also outperforms baseline $\ell_{\infty}$-norm-bounded adversarial training approaches by a significant margin.
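The component-wise projected gradient descent mentioned in the abstract can be illustrated with a minimal toy sketch: each semantic perturbation parameter (e.g. a hue shift or a rotation angle) lives in its own admissible interval, and after each gradient-ascent step the parameter is projected back into that interval. The bounds, the loss, and its gradient below are illustrative assumptions, not the paper's exact setup; a real implementation would backpropagate through differentiable hue/rotation transforms and the attacked model.

```python
import numpy as np

# Illustrative per-component bounds (assumed, not from the paper):
# hue shift in [-pi, pi] radians, rotation in [-10, 10] degrees.
BOUNDS = {"hue": (-np.pi, np.pi), "rotation": (-10.0, 10.0)}

def toy_loss_grad(params):
    """Stand-in for the gradient of an attack loss w.r.t. each semantic
    parameter. Always positive here (cos(v) + 1.5 >= 0.5), so the toy
    ascent pushes every parameter upward until it hits its bound."""
    return {k: np.cos(v) + 1.5 for k, v in params.items()}

def component_pgd(params, step=0.5, iters=10):
    """Signed gradient ascent on each component, followed by projection
    (np.clip) back into that component's interval."""
    p = dict(params)
    for _ in range(iters):
        grads = toy_loss_grad(p)
        for name, (lo, hi) in BOUNDS.items():
            p[name] = np.clip(p[name] + step * np.sign(grads[name]), lo, hi)
    return p

adv = component_pgd({"hue": 0.0, "rotation": 0.0})
```

With these toy gradients, the hue parameter saturates at its upper bound $\pi$ while the rotation angle climbs to 5.0 degrees, showing how each component is constrained independently.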

Lei Hsiung, Yun-Yun Tsai, Pin-Yu Chen, Tsung-Yi Ho • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Adversarial Attack Success Rate | CIFAR-10 | Clean Success Rate: 0.00e+0 | 12 |
| Image Classification | CIFAR-10 | Accuracy (Clean): 0.00e+0 | 12 |
| Image Classification | CIFAR-10 (test) | Clean Accuracy: 83.4 | 12 |
| Image Classification | CIFAR-10 | Clean Accuracy: 83.4 | 12 |
| Image Classification | ImageNet (test) | – | 6 |
| Image Classification | SVHN | Clean Accuracy: 93.6 | 4 |
| Robust Accuracy | SVHN | Accuracy (Clean): 93.6 | 4 |
| Adversarial Attack | SVHN (test) | Clean Accuracy: 0.00e+0 | 4 |
| Attack Success Rate | SVHN (test) | Clean Success Rate: 0.00e+0 | 4 |
| Image Classification | ImageNet (test) | Clean Accuracy: 60 | 4 |

Showing 10 of 12 rows
