Boosting Adversarial Transferability by Achieving Flat Local Maxima

About

Transfer-based attacks use adversarial examples generated on a surrogate model to attack various other models, which makes them applicable in the physical world and has attracted increasing interest. Recently, various adversarial attacks have emerged to boost adversarial transferability from different perspectives. In this work, inspired by the observation that flat local minima are correlated with good generalization, we assume and empirically validate, by introducing a penalized gradient norm into the original loss function, that adversarial examples in a flat local region tend to have good transferability. Since directly optimizing the gradient norm regularizer is computationally expensive and intractable for generating adversarial examples, we propose an approximation method that simplifies the gradient update of the objective function. Specifically, we randomly sample an example and adopt a first-order procedure to approximate the Hessian-vector product (the curvature term), which makes the computation more efficient by interpolating two neighboring gradients. Meanwhile, to obtain a more stable gradient direction, we randomly sample multiple examples and average their gradients to reduce the variance introduced by random sampling during the iterative process. Extensive experimental results on the ImageNet-compatible dataset show that the proposed method can generate adversarial examples in flat local regions and significantly improves adversarial transferability on both normally trained and adversarially trained models compared with state-of-the-art attacks. Our code is available at: https://github.com/Trustworthy-AI-Group/PGN.
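The update described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation (see the repository above for that). It assumes a toy logistic-loss surrogate with an analytic gradient, and the names `toy_loss`, `toy_loss_grad`, `flat_maxima_attack`, the hyperparameter values, the momentum term, the sign step, and the L1 normalization are all illustrative assumptions. The idea it tries to capture: the gradient at a randomly sampled neighbor (`g1`) and the gradient one normalized step away from it (`g2`) are interpolated, which stands in for the Hessian-vector product that an exact gradient-norm penalty would require, and the result is averaged over several random samples.

```python
import numpy as np

def toy_loss(x, w, y):
    """Logistic loss of a linear toy surrogate (an assumption, not the paper's model)."""
    return np.log1p(np.exp(-y * np.dot(w, x)))

def toy_loss_grad(x, w, y):
    """Analytic gradient of toy_loss with respect to the input x."""
    margin = y * np.dot(w, x)
    return -y * w / (1.0 + np.exp(margin))

def flat_maxima_attack(x, w, y, eps=0.5, steps=10, n_samples=20,
                       delta=0.5, zeta_scale=3.0, mu=1.0, seed=0):
    """Sketch of a gradient-norm-penalized attack; all defaults are hypothetical."""
    rng = np.random.default_rng(seed)
    alpha = eps / steps              # per-iteration step size
    zeta = zeta_scale * eps          # radius of the random neighborhood
    x_adv = x.copy()
    momentum = np.zeros_like(x)
    for _ in range(steps):
        g_bar = np.zeros_like(x)
        for _ in range(n_samples):
            # Randomly sample an example near the current adversarial point.
            x_near = x_adv + rng.uniform(-zeta, zeta, size=x.shape)
            g1 = toy_loss_grad(x_near, w, y)
            # Neighboring point one normalized step against the gradient.
            x_star = x_near - alpha * g1 / (np.abs(g1).sum() + 1e-12)
            g2 = toy_loss_grad(x_star, w, y)
            # Interpolating the two gradients replaces the Hessian-vector product.
            g_bar += ((1.0 - delta) * g1 + delta * g2) / n_samples
        # Momentum accumulation and sign step (assumed, in the style of MI-FGSM).
        momentum = mu * momentum + g_bar / (np.abs(g_bar).sum() + 1e-12)
        x_adv = np.clip(x_adv + alpha * np.sign(momentum), x - eps, x + eps)
    return x_adv
```

A call such as `flat_maxima_attack(x, w, y)` returns a perturbed input that stays within `eps` of `x` in the L-infinity norm. On a real network, `toy_loss_grad` would be replaced by backpropagation through the surrogate model.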

Zhijin Ge, Hongying Liu, Xiaosen Wang, Fanhua Shang, Yuanyuan Liu • 2023

Related benchmarks

Task	Dataset	Result
Adversarial Attack	ImageNet (val)	ASR (General): 56.98	222
Untargeted Adversarial Attack	CIFAR-10 (test)	ASR: 86.73	95
Adversarial Attack Transferability	ImageNet	Transfer Success Rate (Target: VGG16): 81.31	93
Adversarial Attack Transferability	ImageNet-1k (val)	ASR (VGG16): 36.3	93
Adversarial Attack Transferability	ImageNet (test)	VGG16 Accuracy: 18.84	93
Adversarial Robustness	CIFAR-10 (test)	Attack Success Rate (ASR): 78.8	76
Targeted Adversarial Attack	CIFAR-10	ASR: 5.4	43
Untargeted Adversarial Attack	ImageNet-Compatible	Inc-v3 Performance: 100	24
Black-box Adversarial Attack	ImageNet (test)	Success Rate (Res34): 100	13
Untargeted Adversarial Attack	ImageNet	ComDefend Robustness: 93.7	6

Showing 10 of 12 rows
