Make Some Noise: Reliable and Efficient Single-Step Adversarial Training

About

Recently, Wong et al. showed that adversarial training with single-step FGSM leads to a characteristic failure mode named Catastrophic Overfitting (CO), in which a model becomes suddenly vulnerable to multi-step attacks. Experimentally they showed that simply adding a random perturbation prior to FGSM (RS-FGSM) could prevent CO. However, Andriushchenko and Flammarion observed that RS-FGSM still leads to CO for larger perturbations, and proposed a computationally expensive regularizer (GradAlign) to avoid it. In this work, we methodically revisit the role of noise and clipping in single-step adversarial training. Contrary to previous intuitions, we find that using a stronger noise around the clean sample combined with \textit{not clipping} is highly effective in avoiding CO for large perturbation radii. We then propose Noise-FGSM (N-FGSM) that, while providing the benefits of single-step adversarial training, does not suffer from CO. Empirical analyses on a large suite of experiments show that N-FGSM is able to match or surpass the performance of previous state-of-the-art GradAlign, while achieving 3x speed-up. Code can be found in https://github.com/pdejorge/N-FGSM

Pau de Jorge, Adel Bibi, Riccardo Volpi, Amartya Sanyal, Philip H. S. Torr, Gr\'egory Rogez, Puneet K. Dokania• 2022

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet-100	--	163
Image Classification	ImageNet-100 (test)	Clean Accuracy49.38	123
Image Classification	CIFAR-100	Clean Accuracy57.28	90
Image Classification	CIFAR-10	Clean Accuracy79.44	89
Image Classification	CIFAR-10 (test)	--	82
Adversarial Robustness	CIFAR-10 (test)	--	76
Image Classification	PathMNIST	Clean Accuracy84.01	60
Image Classification	CIFAR-10 (test)	Natural Accuracy80.4	48
Medical Image Classification	PathMNIST	Clean Accuracy84.01	48
Image Classification	CIFAR-10 (test)	Clean Accuracy92.49	40

Showing 10 of 41 rows

Other info

Follow for update

@wizwand_team Discord