HopSkipJumpAttack: A Query-Efficient Decision-Based Attack

About

The goal of a decision-based adversarial attack on a trained model is to generate adversarial examples based solely on observing output labels returned by the targeted model. We develop HopSkipJumpAttack, a family of algorithms based on a novel estimate of the gradient direction using binary information at the decision boundary. The proposed family includes both untargeted and targeted attacks optimized for $\ell_2$ and $\ell_\infty$ similarity metrics respectively. Theoretical analysis is provided for the proposed algorithms and the gradient direction estimate. Experiments show HopSkipJumpAttack requires significantly fewer model queries than Boundary Attack. It also achieves competitive performance in attacking several widely-used defense mechanisms. (HopSkipJumpAttack was named Boundary Attack++ in a previous version of the preprint.)
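The abstract does not spell out the gradient-direction estimate, so the following is only a minimal sketch under stated assumptions. It assumes a Monte Carlo form: perturb a boundary point along random unit directions, query the label-only oracle at each perturbed point, and average the directions weighted by a +1/-1 indicator, with the mean subtracted as a baseline. The names `estimate_gradient_direction` and `is_adversarial`, the default sample count and step size, and the toy linear classifier are all hypothetical and chosen for illustration.

```python
import numpy as np


def estimate_gradient_direction(is_adversarial, x, num_samples=1000,
                                delta=0.01, rng=None):
    """Estimate the gradient direction at a boundary point x from labels only.

    is_adversarial is a hypothetical oracle: it takes one input and returns
    True if the model's decision counts as adversarial, else False.
    """
    rng = np.random.default_rng() if rng is None else rng
    bshape = (num_samples,) + (1,) * x.ndim  # shape for per-sample broadcasting

    # Draw random directions and normalize each one to unit length.
    u = rng.standard_normal((num_samples,) + x.shape)
    u /= np.linalg.norm(u.reshape(num_samples, -1), axis=1).reshape(bshape)

    # One model query per perturbed point; phi is +1 if adversarial, else -1.
    phi = np.array([1.0 if is_adversarial(x + delta * ub) else -1.0
                    for ub in u])

    # Subtract the mean as a baseline to reduce variance. Skip it when all
    # labels agree, since the weights would otherwise all become zero.
    mean = phi.mean()
    weights = phi if abs(mean) == 1.0 else phi - mean

    grad = (weights.reshape(bshape) * u).mean(axis=0)
    return grad / np.linalg.norm(grad)


if __name__ == "__main__":
    # Toy label-only model (an assumption): adversarial iff w . z > 0.
    w = np.array([1.0, -2.0, 0.5, 3.0])
    x0 = np.zeros(4)  # lies exactly on the decision boundary w . z = 0
    g = estimate_gradient_direction(lambda z: float(w @ z) > 0, x0,
                                    num_samples=2000,
                                    rng=np.random.default_rng(0))
    print("cosine with true normal:", float(g @ w) / np.linalg.norm(w))
```

For a linear boundary the true gradient direction is the normal `w`, so the printed cosine gives a quick sanity check on the estimate. Each sample costs one model query, which is why the number of samples drives the query budget of a decision-based attack.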

Jianbo Chen, Michael I. Jordan, Martin J. Wainwright • 2019

Related benchmarks

Task	Dataset	Result
Object Hallucination Evaluation	POPE	--	2019
Black-box Attack	LSUN	ASR 95.8	189
Black-box Attack	GenImage	ASR 99.2	162
Targeted Adversarial Attack	ImageNet ILSVRC2012 (val)	Median L2 Distance 78.626	120
Targeted Adversarial Attack	ImageNet (val)	Median L2 Distance 76.727	120
Adversarial Attack	ILSVRC 2012 (val)	Median L2 Distance 24.181	112
Targeted Black-box Adversarial Attack	ImageNet	Average L2 Norm 10.715	96
Non-targeted Black-box Adversarial Attack	ImageNet	Average L2 Norm 2.55	96
Adversarial Attack	ILSVRC 2012	Median L2 Distance 17.75	96
Adversarial Attack	ImageNet-21K (val)	Median L2 Distance 4.367	80

Showing 10 of 42 rows
