Boosting Adversarial Transferability by Achieving Flat Local Maxima

About

Transfer-based attacks use adversarial examples generated on a surrogate model to attack other models, which makes them applicable in the physical world and has attracted increasing interest. Recently, various adversarial attacks have emerged to boost adversarial transferability from different perspectives. In this work, inspired by the observation that flat local minima are correlated with good generalization, we assume and empirically validate that adversarial examples at a flat local region tend to have good transferability by introducing a penalized gradient norm to the original loss function. Since directly optimizing the gradient regularization norm is computationally expensive and intractable for generating adversarial examples, we propose an approximation optimization method to simplify the gradient update of the objective function. Specifically, we randomly sample an example and adopt a first-order procedure to approximate the Hessian-vector product, which makes the computation more efficient by interpolating two neighboring gradients. Meanwhile, in order to obtain a more stable gradient direction, we randomly sample multiple examples and average their gradients to reduce the variance due to random sampling during the iterative process. Extensive experimental results on the ImageNet-compatible dataset show that the proposed method can generate adversarial examples at flat local regions and significantly improves adversarial transferability on both normally trained and adversarially trained models compared with state-of-the-art attacks. Our code is available at: https://github.com/Trustworthy-AI-Group/PGN.
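
The update described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation (see the repository linked above for that). It assumes the penalized objective is the loss minus a weighted gradient norm, and that the Hessian-vector product is replaced by a finite difference, so the update direction becomes an interpolation (1 - delta) * g1 + delta * g2 between the gradient at a sampled point and the gradient one small step down the loss from it. The toy loss, the function names, and the hyperparameters `delta`, `zeta`, `n_samples` and their default values are all hypothetical placeholders.

```python
import numpy as np

def toy_loss(x, w):
    # Toy stand-in for a surrogate model's loss: J(x) = sum(softplus(w * x)).
    return np.sum(np.log1p(np.exp(w * x)))

def toy_grad(x, w):
    # Analytic gradient of toy_loss with respect to x: w * sigmoid(w * x).
    return w / (1.0 + np.exp(-w * x))

def flat_region_attack(x, grad_fn, eps=0.1, steps=10, n_samples=8,
                       delta=0.5, zeta=3.0, seed=0):
    """Iterative L-infinity attack with an interpolated two-gradient update.

    All hyperparameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    alpha = eps / steps                      # per-iteration step size
    x_adv = x.astype(float).copy()
    momentum = np.zeros_like(x_adv)
    for _ in range(steps):
        g_bar = np.zeros_like(x_adv)
        for _ in range(n_samples):
            # Randomly sample a point in the neighborhood of the current example.
            x_near = x_adv + rng.uniform(-zeta * eps, zeta * eps, size=x_adv.shape)
            g1 = grad_fn(x_near)
            # Take one normalized step against the gradient and re-evaluate it;
            # the difference of the two gradients approximates the
            # Hessian-vector product to first order.
            x_step = x_near - alpha * g1 / (np.abs(g1).sum() + 1e-12)
            g2 = grad_fn(x_step)
            # Interpolate the two neighboring gradients and average over samples.
            g_bar += ((1.0 - delta) * g1 + delta * g2) / n_samples
        # Accumulate the normalized averaged gradient as momentum.
        momentum = momentum + g_bar / (np.abs(g_bar).sum() + 1e-12)
        # Signed ascent step, projected back into the eps-ball around x.
        x_adv = np.clip(x_adv + alpha * np.sign(momentum), x - eps, x + eps)
    return x_adv

# Example usage on the toy loss.
w = np.array([1.0, -2.0, 0.5])
x0 = np.zeros(3)
x_adv = flat_region_attack(x0, lambda z: toy_grad(z, w))
```

In a real setting `grad_fn` would be the input gradient of a surrogate network's classification loss, and the perturbed image would also be clipped to the valid pixel range.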

Zhijin Ge, Hongying Liu, Xiaosen Wang, Fanhua Shang, Yuanyuan Liu • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Adversarial Attack | ImageNet (val) | ASR (General): 56.98 | 222 |
| Adversarial Robustness | CIFAR-10 (test) | Attack Success Rate (ASR): 78.8 | 76 |
| Untargeted Adversarial Attack | CIFAR-10 (test) | ASR: 86.73 | 57 |
| Untargeted Adversarial Attack | ImageNet-Compatible | Inc-v3 Performance: 100 | 24 |
| Black-box Adversarial Attack | ImageNet (test) | Success Rate (Res34): 100 | 13 |
| Untargeted Adversarial Attack | ImageNet | ComDefend Robustness: 93.7 | 6 |
| Untargeted Adversarial Attack | ImageNet-compatible (test) | Acc (Inc-v3): 100 | 6 |
