Attacking deep networks with surrogate-based adversarial black-box methods is easy

About

A recent line of work on black-box adversarial attacks has revived the use of transfer from surrogate models by integrating it into query-based search. However, we find that existing approaches of this type underperform their potential, and can be overly complicated besides. Here, we provide a short and simple algorithm which achieves state-of-the-art results through a search which uses the surrogate network's class-score gradients, with no need for other priors or heuristics. The guiding assumption of the algorithm is that the studied networks are in a fundamental sense learning similar functions, and that a transfer attack from one to the other should thus be fairly "easy". This assumption is validated by the extremely low query counts and failure rates achieved: e.g. an untargeted attack on a VGG-16 ImageNet network using a ResNet-152 as the surrogate yields a median query count of 6 at a success rate of 99.9%. Code is available at https://github.com/fiveai/GFCS.

Nicholas A. Lord, Romain Mueller, Luca Bertinetto (2022)
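The search described in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation (that is in the linked repository); it is a minimal illustration under stated assumptions. Both networks are stand-in linear classifiers, the loss is a simple margin between the best wrong class and the true class, and the attacker only ever reads the victim's output scores. Each iteration takes the surrogate's class-score gradient as a candidate direction, queries the victim at plus and minus one step, and keeps the candidate only if the victim's margin improves. The function names, the L2 budget `eps`, the step size, and the random-class-weight fallback used when the gradient direction fails are all hypothetical choices made for this example.

```python
import numpy as np

def margin(scores, label):
    """Best wrong-class score minus true-class score; positive means misclassified."""
    return np.delete(scores, label).max() - scores[label]

def surrogate_direction(W_s, x, label, rng, use_loss_gradient):
    """Unit search direction computed from the (linear, toy) surrogate only."""
    if use_loss_gradient:
        scores = W_s @ x
        wrong = scores.copy()
        wrong[label] = -np.inf
        j = int(np.argmax(wrong))
        g = W_s[j] - W_s[label]  # gradient of the surrogate margin w.r.t. x
    else:
        # Assumed fallback: gradient of a random weighting of surrogate class scores.
        w = rng.uniform(-1.0, 1.0, size=W_s.shape[0])
        g = w @ W_s
    return g / (np.linalg.norm(g) + 1e-12)

def attack(victim_scores, W_s, x0, label, eps=2.0, step=0.5, max_queries=200, seed=0):
    """Score-based search on the victim, guided by surrogate gradients."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    best = margin(victim_scores(x), label)
    queries = 1
    use_grad = True
    while best <= 0 and queries < max_queries:
        d = surrogate_direction(W_s, x, label, rng, use_grad)
        improved = False
        for sign in (1.0, -1.0):
            # Take a step, then project back onto the L2 ball of radius eps around x0.
            delta = x + sign * step * d - x0
            norm = np.linalg.norm(delta)
            if norm > eps:
                delta *= eps / norm
            cand = x0 + delta
            m = margin(victim_scores(cand), label)  # one black-box query
            queries += 1
            if m > best:
                x, best, improved = cand, m, True
                break
            if queries >= max_queries:
                break
        # Keep using the loss gradient while it helps; otherwise try random directions.
        use_grad = improved
    return x, best > 0, queries

# Toy setup: the surrogate is a slightly perturbed copy of the victim.
W_v = np.eye(3)
W_s = np.array([[1.0, 0.1, 0.0],
                [0.0, 1.0, 0.1],
                [0.1, 0.0, 1.0]])
x0 = np.array([1.0, 0.0, 0.0])

x_adv, success, queries = attack(lambda x: W_v @ x, W_s, x0, label=0)
print("success:", success, "queries:", queries)
```

Because the toy surrogate is nearly identical to the victim, its gradient direction is almost always accepted, which is the intuition behind the low query counts reported in the abstract. With real networks, the gradients would come from automatic differentiation rather than from the rows of a weight matrix.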

Related benchmarks

Task	Dataset	Result
Untargeted Score-based Black-box Attack	ImageNet	ASR 100	96
Targeted Score-based Black-box Attack	ImageNet	ASR 95	96
Untargeted Adversarial Attack	ImageNet (test)	--	26
Untargeted Adversarial Attack	VGG-19	Fooling Rate 100	5
Untargeted Adversarial Attack	DenseNet-121	Fooling Rate 99.9	5
Untargeted Adversarial Attack	ResNext-50	Fooling Rate 99.7	5
Targeted Adversarial Attack	DenseNet-121	Fooling Rate 95.2	4
Targeted Adversarial Attack	ResNext-50	Fooling Rate 92.9	4
Targeted Adversarial Attack	VGG-19	Fooling Rate 89.1	4

