
Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations

About

Model robustness against adversarial examples of a single perturbation type, such as the $\ell_{p}$-norm, has been widely studied, yet its generalization to more realistic scenarios involving multiple semantic perturbations and their composition remains largely unexplored. In this paper, we first propose a novel method for generating composite adversarial examples. Our method finds the optimal attack composition by utilizing component-wise projected gradient descent and automatic attack-order scheduling. We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_{p}$-ball to composite semantic perturbations, such as combinations of hue, saturation, brightness, contrast, and rotation. Results on the ImageNet and CIFAR-10 datasets indicate that GAT is robust not only to every tested single-attack type but also to any combination of such attacks. GAT also outperforms baseline $\ell_{\infty}$-norm-bounded adversarial training approaches by a significant margin.
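The component-wise projected gradient descent mentioned in the abstract can be illustrated with a minimal toy sketch: each semantic perturbation parameter (e.g. a hue shift or a rotation angle) lives in its own admissible interval, and after each gradient-ascent step the parameter is projected back into that interval. The bounds, the loss, and its gradient below are illustrative assumptions, not the paper's exact setup; a real implementation would backpropagate through differentiable hue/rotation transforms and the attacked model.

```python
import numpy as np

# Illustrative per-component bounds (assumed, not from the paper):
# hue shift in [-pi, pi] radians, rotation in [-10, 10] degrees.
BOUNDS = {"hue": (-np.pi, np.pi), "rotation": (-10.0, 10.0)}

def toy_loss_grad(params):
    """Stand-in for the gradient of an attack loss w.r.t. each semantic
    parameter. Always positive here (cos(v) + 1.5 >= 0.5), so the toy
    ascent pushes every parameter upward until it hits its bound."""
    return {k: np.cos(v) + 1.5 for k, v in params.items()}

def component_pgd(params, step=0.5, iters=10):
    """Signed gradient ascent on each component, followed by projection
    (np.clip) back into that component's interval."""
    p = dict(params)
    for _ in range(iters):
        grads = toy_loss_grad(p)
        for name, (lo, hi) in BOUNDS.items():
            p[name] = np.clip(p[name] + step * np.sign(grads[name]), lo, hi)
    return p

adv = component_pgd({"hue": 0.0, "rotation": 0.0})
```

With these toy gradients, the hue parameter saturates at its upper bound $\pi$ while the rotation angle climbs to 5.0 degrees, showing how each component is constrained independently.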

Lei Hsiung, Yun-Yun Tsai, Pin-Yu Chen, Tsung-Yi Ho • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Adversarial Attack Success Rate | CIFAR-10 | Clean Success Rate: 0.00e+0 | 12 |
| Image Classification | CIFAR-10 | Accuracy (Clean): 0.00e+0 | 12 |
| Image Classification | CIFAR-10 (test) | Clean Accuracy: 83.4 | 12 |
| Image Classification | CIFAR-10 | Clean Accuracy: 83.4 | 12 |
| Image Classification | ImageNet (test) | – | 6 |
| Image Classification | SVHN | Clean Accuracy: 93.6 | 4 |
| Robust Accuracy | SVHN | Accuracy (Clean): 93.6 | 4 |
| Adversarial Attack | SVHN (test) | Clean Accuracy: 0.00e+0 | 4 |
| Attack Success Rate | SVHN (test) | Clean Success Rate: 0.00e+0 | 4 |
| Image Classification | ImageNet (test) | Clean Accuracy: 60 | 4 |

Showing 10 of 12 rows
