Improving Transferable Targeted Attacks with Feature Tuning Mixup

About

Deep neural networks (DNNs) exhibit vulnerability to adversarial examples that can transfer across different DNN models. A particularly challenging problem is developing transferable targeted attacks that can mislead DNN models into predicting specific target classes. While various methods have been proposed to enhance attack transferability, they often incur substantial computational costs while yielding limited improvements. Recent clean feature mixup methods use random clean features to perturb the feature space but lack optimization for disrupting adversarial examples, overlooking the advantages of attack-specific perturbations. In this paper, we propose Feature Tuning Mixup (FTM), a novel method that enhances targeted attack transferability by combining both random and optimized noises in the feature space. FTM introduces learnable feature perturbations and employs an efficient stochastic update strategy for optimization. These learnable perturbations facilitate the generation of more robust adversarial examples with improved transferability. We further demonstrate that attack performance can be enhanced through an ensemble of multiple FTM-perturbed surrogate models. Extensive experiments on the ImageNet-compatible dataset across various DNN models demonstrate that our method achieves significant improvements over state-of-the-art methods while maintaining low computational cost.

Kaisheng Liang, Xuelong Dai, Yanjie Li, Dong Wang, Bin Xiao• 2024

Related benchmarks

Task	Dataset	Result
Targeted Adversarial Attack	ImageNet-Compatible	Avg Success Rate33.3	73
Targeted Adversarial Attack	ImageNet	VGG-16 Score88.2	39
Targeted Adversarial Attack	ImageNet	Dense-121 Score41.5	31
Targeted Adversarial Attack	ImageNet (val)	ViT Performance1.84e+3	23
Targeted Adversarial Attack	ImageNet (test)	Average Success Rate74.3	17
Adversarial Attack	ImageNet V2	ASR83.2	12
Targeted Adversarial Attack	ImageNet	VGG-16 Robust Accuracy34.2	10
Targeted Adversarial Attack	ImageNet RN-50 Source 1k (val)	ViT Performance Score6.8	10
Targeted Adversarial Attack	ImageNet	VGG-16 Score90	9
Targeted Adversarial Attack	ImageNet-compatible 100 images	Success Count47	5

Showing 10 of 11 rows

Other info

Code

Follow for update

@wizwand_team Discord