Admix: Enhancing the Transferability of Adversarial Attacks

About

Deep neural networks are known to be extremely vulnerable to adversarial examples in the white-box setting. Moreover, adversarial examples crafted on a surrogate (source) model often transfer in a black-box fashion to other models trained on the same task but with different architectures. Various methods have recently been proposed to boost adversarial transferability, among which input transformation is one of the most effective. We investigate this direction and observe that existing transformations are all applied to a single image, which might limit adversarial transferability. To this end, we propose a new input-transformation-based attack called Admix that considers the input image together with a set of images randomly sampled from other categories. Instead of calculating the gradient directly on the original input, Admix calculates the gradient on the input image admixed with a small portion of each add-in image, while keeping the original label of the input, to craft more transferable adversarial examples. Empirical evaluations on the standard ImageNet dataset demonstrate that Admix achieves significantly better transferability than existing input transformation methods in both the single-model and ensemble-model settings. When combined with existing input transformations, our method further improves transferability and outperforms the state-of-the-art combination of input transformations by a clear margin when attacking nine advanced defense models in the ensemble-model setting. Code is available at https://github.com/JHL-HUST/Admix.
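
The core idea in the abstract (take the gradient on the input admixed with a small portion of each add-in image, while keeping the input's original label) can be illustrated with a small sketch. This is not the authors' implementation; see the linked repository for that. It assumes a toy linear softmax classifier in place of a deep network and a single sign-gradient step, and the mixing weight `eta`, the step size `eps`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def loss_grad_wrt_input(W, x, y):
    """Gradient of the cross-entropy loss w.r.t. x for the toy model logits = W @ x."""
    p = softmax(W @ x)
    p[y] -= 1.0          # dLoss/dlogits = softmax(logits) - one_hot(y)
    return W.T @ p       # chain rule through the linear layer

def admix_gradient(W, x, y, add_ins, eta=0.2):
    """Average the gradient over copies of x admixed with eta * x_other.

    The label y of the original input is used for every admixed copy.
    eta is an assumed mixing weight, not a value taken from the source.
    """
    grads = [loss_grad_wrt_input(W, x + eta * x_other, y) for x_other in add_ins]
    return np.mean(grads, axis=0)

def admix_step(W, x, y, add_ins, eps=0.05, eta=0.2):
    """One sign-gradient ascent step on the loss, using the admixed gradient."""
    g = admix_gradient(W, x, y, add_ins, eta)
    return np.clip(x + eps * np.sign(g), 0.0, 1.0)

# Toy usage: 3 classes, 8 "pixels", 4 add-in images standing in for
# images sampled from other categories.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
x = rng.uniform(size=8)
add_ins = rng.uniform(size=(4, 8))
x_adv = admix_step(W, x, y=0, add_ins=add_ins)
```

With `eta=0` the admixed gradient reduces to the ordinary input gradient, which makes the contribution of the add-in images easy to isolate when experimenting.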

Xiaosen Wang, Xuanran He, Jingdong Wang, Kun He • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Adversarial Attack | ImageNet (test) | -- | 101 |
| Untargeted Adversarial Attack | CIFAR-10 (test) | ASR: 59.6 | 95 |
| Adversarial Attack Transferability | ImageNet-1k (val) | ASR (VGG16): 33.39 | 93 |
| Adversarial Attack Transferability | ImageNet | Transfer Success Rate (Target: VGG16): 74.07 | 93 |
| Adversarial Attack Transferability | ImageNet (test) | VGG16 Accuracy: 16.97 | 93 |
| Image Classification | CXR14 | AUC: 0.82 | 76 |
| Targeted Adversarial Attack | ImageNet-Compatible | Avg Success Rate: 73.2 | 73 |
| Targeted Adversarial Attack | CIFAR-10 | ASR: 3.8 | 43 |
| Adversarial Attack Transferability | ImageNet-Compatible | Transferability on ViT: 99.8 | 29 |
| Adversarial Attack | ImageNet ILSVRC2012 (val) | Robust Accuracy (Inception v3): 100 | 24 |
Showing 10 of 16 rows
