Guessing Smart: Biased Sampling for Efficient Black-Box Adversarial Attacks

About

We consider adversarial examples for image classification in the black-box decision-based setting. Here, an attacker cannot access confidence scores, but only the final label. Most attacks for this scenario are either unreliable or inefficient. Focusing on the latter, we show that a specific class of attacks, Boundary Attacks, can be reinterpreted as a biased sampling framework that gains efficiency from domain knowledge. We identify three such biases (image frequency, regional masks, and surrogate gradients) and evaluate their performance against an ImageNet classifier. We show that the combination of these biases outperforms the state of the art by a wide margin. We also showcase an efficient way to attack the Google Cloud Vision API, where we craft convincing perturbations with just a few hundred queries. Finally, the methods we propose have also been found to work very well against strong defenses: our targeted attack won second place in the NeurIPS 2018 Adversarial Vision Challenge.

Thomas Brunner, Frederik Diehl, Michael Truong Le, Alois Knoll • 2018
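
To make the "biased sampling" idea in the abstract concrete, here is a minimal sketch of one decision-based attack loop that draws its candidate perturbations from a biased distribution instead of plain Gaussian noise. It is an illustration under stated assumptions, not the authors' implementation: the box-filtered noise stands in for a low-frequency prior, the mask is assumed to be the per-pixel difference between the current adversarial image and the original, the surrogate gradient is an optional externally supplied array, and the orthogonal projection step of a full Boundary Attack is omitted. All names (`smooth_noise`, `propose`, `attack`, `grad_weight`, `step`, `shrink`) and the toy classifier are hypothetical.

```python
import math
import random


def smooth_noise(size, rng, kernel=3):
    """Gaussian noise blurred with a box filter (assumed low-frequency bias)."""
    raw = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
    half = kernel // 2
    out = []
    for y in range(size):
        row = []
        for x in range(size):
            window = [raw[j][i]
                      for j in range(max(0, y - half), min(size, y + half + 1))
                      for i in range(max(0, x - half), min(size, x + half + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def l2(a, b):
    """L2 distance between two images stored as nested lists."""
    return math.sqrt(sum((p - q) ** 2
                         for ra, rb in zip(a, b) for p, q in zip(ra, rb)))


def unit(v):
    """Scale an image-shaped array to unit L2 norm (no-op for all zeros)."""
    norm = math.sqrt(sum(p * p for row in v for p in row)) or 1.0
    return [[p / norm for p in row] for row in v]


def propose(x_adv, x_orig, rng, grad=None, grad_weight=0.5,
            step=0.1, shrink=0.05):
    """Draw one biased candidate near x_adv, pulled slightly toward x_orig."""
    size = len(x_adv)
    noise = smooth_noise(size, rng)
    # Regional mask (assumption): perturb most where the images still differ.
    direction = unit([[noise[y][x] * abs(x_adv[y][x] - x_orig[y][x])
                       for x in range(size)] for y in range(size)])
    if grad is not None:
        # Surrogate-gradient bias (assumption): blend in a normalized gradient.
        g = unit(grad)
        direction = unit([[(1 - grad_weight) * d + grad_weight * gv
                           for d, gv in zip(d_row, g_row)]
                          for d_row, g_row in zip(direction, g)])
    candidate = []
    for y in range(size):
        row = []
        for x in range(size):
            v = x_adv[y][x] + step * direction[y][x]
            v += shrink * (x_orig[y][x] - v)   # contract toward the original
            row.append(min(1.0, max(0.0, v)))  # keep pixels in [0, 1]
        candidate.append(row)
    return candidate


def attack(x_orig, x_start, is_adversarial, steps=200, seed=0):
    """Keep a candidate only if it is closer and still adversarial."""
    rng = random.Random(seed)
    best = x_start
    for _ in range(steps):
        cand = propose(best, x_orig, rng)
        # The label-only oracle is queried only for candidates that are closer.
        if l2(cand, x_orig) < l2(best, x_orig) and is_adversarial(cand):
            best = cand
    return best


# Toy setup: a "classifier" that flips its label when mean brightness > 0.5.
SIZE = 8
x_orig = [[0.3] * SIZE for _ in range(SIZE)]
x_start = [[0.9] * SIZE for _ in range(SIZE)]


def is_adversarial(img):
    return sum(sum(row) for row in img) / (SIZE * SIZE) > 0.5


result = attack(x_orig, x_start, is_adversarial)
print(l2(x_start, x_orig), l2(result, x_orig))
```

Because a candidate is accepted only when it is both closer to the original and still misclassified, the distance never increases over the run. The biases only change where candidates are drawn from, which is where any query savings would come from.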

Related benchmarks

Task	Dataset	Result
Adversarial Attack	ILSVRC 2012 (val)	Median L2 Distance: 4.182	112
Adversarial Attack	ILSVRC 2012	Median L2 Distance: 5.44	96
Adversarial Attack	ImageNet-21K (val)	Median L2 Distance: 1.263	80
Adversarial Attack	Tiny ImageNet (val)	Median L2 Distance: 0.23	64
Adversarial Attack	ImageNet 21k (test)	Median L2 Distance: 4.008	64
Untargeted Attack	ImageNet (test)	Mean L2 Distortion (2K Budget): 28.44	42
Targeted Attack	ImageNet (test)	Mean L2 Distortion (2K Budget): 35.28	38
Targeted Adversarial Attack	ILSVRC 2012	Median Noise Magnitude: 67.728	7
