Enhancing the Transferability of Adversarial Attacks through Variance Tuning

About

Deep neural networks are vulnerable to adversarial examples that mislead the models with imperceptible perturbations. Though adversarial attacks have achieved incredible success rates in the white-box setting, most existing adversaries often exhibit weak transferability in the black-box setting, especially under the scenario of attacking models with defense mechanisms. In this work, we propose a new method called variance tuning to enhance the class of iterative gradient based attack methods and improve their attack transferability. Specifically, at each iteration for the gradient calculation, instead of directly using the current gradient for the momentum accumulation, we further consider the gradient variance of the previous iteration to tune the current gradient so as to stabilize the update direction and escape from poor local optima. Empirical results on the standard ImageNet dataset demonstrate that our method could significantly improve the transferability of gradient-based adversarial attacks. Besides, our method could be used to attack ensemble models or be integrated with various input transformations. Incorporating variance tuning with input transformations on iterative gradient-based attacks in the multi-model setting, the integrated method could achieve an average success rate of 90.1% against nine advanced defense methods, improving the current best attack performance significantly by 85.1%. Code is available at https://github.com/JHL-HUST/VT.

Xiaosen Wang, Kun He • 2021
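
The update described in the abstract (tuning the current gradient with the gradient variance from the previous iteration before momentum accumulation) can be illustrated with a small NumPy sketch. This is not the authors' implementation; see the linked repository for that. The toy logistic model, the function names (`loss`, `loss_grad`, `variance_tuned_attack`), the uniform neighborhood sampling used to estimate the variance, the L1 normalization, and all hyperparameter values are assumptions made for illustration only.

```python
import numpy as np

def loss(w, x, y):
    """Binary cross-entropy of a toy logistic model p = sigmoid(w . x)."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def loss_grad(w, x, y):
    """Gradient of the loss above with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    return (p - y) * w

def variance_tuned_attack(w, x, y, eps=0.5, steps=10, mu=1.0,
                          n_samples=20, beta=1.5, seed=0):
    """Momentum-based iterative attack with a variance-tuned gradient.

    Assumed form: the variance term is the mean gradient over random
    neighbors of the current point minus the gradient at that point.
    """
    rng = np.random.default_rng(seed)
    alpha = eps / steps                 # per-step size
    x_adv = x.copy()
    momentum = np.zeros_like(x)
    variance = np.zeros_like(x)         # no variance before the first step
    for _ in range(steps):
        grad = loss_grad(w, x_adv, y)
        # Tune the current gradient with the previous iteration's variance,
        # then accumulate the L1-normalized result into the momentum.
        tuned = grad + variance
        momentum = mu * momentum + tuned / (np.abs(tuned).sum() + 1e-12)
        # Estimate the variance at the current point for the next iteration.
        noise = rng.uniform(-beta * eps, beta * eps,
                            size=(n_samples,) + x.shape)
        neighbor_grad = np.mean([loss_grad(w, x_adv + r, y) for r in noise],
                                axis=0)
        variance = neighbor_grad - grad
        # Signed step, projected back into the eps-ball around the input.
        x_adv = np.clip(x_adv + alpha * np.sign(momentum), x - eps, x + eps)
    return x_adv

# Toy usage with made-up weights and input.
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.2, -0.1, 0.4])
x_adv = variance_tuned_attack(w, x, y=1.0)
```

On this linear toy model every gradient points along `w`, so the variance term only rescales the step. The stabilizing effect the abstract describes would matter on non-linear models, where gradients at neighboring points differ in direction.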

Related benchmarks

Task	Dataset	Result
Black-box Adversarial Attack	CtrSVDD	ASR 93.27	240
Adversarial Attack	ImageNet (val)	ASR (General) 100	222
Adversarial Attack	ImageNet	Attack Success Rate 75	178
Adversarial Attack	ImageNet (test)	Success Rate 38.6	107
Untargeted Adversarial Attack	CIFAR-10 (test)	ASR 68.05	95
Adversarial Attack Transferability	ImageNet (test)	VGG16 Accuracy 15	93
Adversarial Attack Transferability	ImageNet-1k (val)	ASR (VGG16) 24.09	93
Adversarial Attack Transferability	ImageNet	Transfer Success Rate (Target: VGG16) 55.36	93
Speech Recognition	LibriSpeech (test)	WER 0.4275	76
Adversarial Attack	ImageNet-1K	Inc-v3ens3 49.9	48

Showing 10 of 38 rows
