
SurFree: a fast surrogate-free black-box attack

About

Machine learning classifiers are critically prone to evasion attacks. Adversarial examples are slightly modified inputs that are misclassified while remaining perceptually close to their originals. The last couple of years have witnessed a striking decrease in the number of queries a black-box attack submits to the target classifier in order to forge adversarial examples. This particularly concerns the black-box score-based setup, where the attacker has access to the top predicted probabilities: the number of queries went from millions to less than a thousand. This paper presents SurFree, a geometrical approach that achieves a similarly drastic reduction in the number of queries in the hardest setup: black-box decision-based attacks (only the top-1 label is available). We first highlight that the most recent attacks in that setup, HSJA, QEBA and GeoDA, all perform costly gradient surrogate estimations. SurFree proposes to bypass these by instead focusing on careful trials along diverse directions, guided by precise indications of geometrical properties of the classifier decision boundaries. We motivate this geometric approach before performing a head-to-head comparison with previous attacks, with the number of queries as a first-class citizen. We exhibit a faster distortion decay under low query budgets (a few hundred to a thousand), while remaining competitive at higher query budgets.
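To make the decision-based setup concrete, the following is a minimal sketch of an oracle that returns only the top-1 label and counts queries, together with a binary search along a single direction toward the decision boundary. It is not the SurFree algorithm: the toy linear classifier, the `Top1Oracle` class and the `boundary_search` helper are hypothetical names introduced purely for illustration, and the binary search is only a generic building block for attacks in this setting.

```python
import numpy as np


class Top1Oracle:
    """Hypothetical decision-based oracle: exposes only the top-1 label."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias
        self.queries = 0  # every call to label() costs one query

    def label(self, x):
        self.queries += 1
        return int(np.argmax(self.weights @ x + self.bias))


def boundary_search(oracle, x_orig, x_start, true_label, steps=20):
    """Binary search on the segment [x_orig, x_start] for a point that is
    still misclassified but lies close to the decision boundary."""
    lo, hi = 0.0, 1.0  # lo: classified as true_label, hi: misclassified
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        x_mid = (1.0 - mid) * x_orig + mid * x_start
        if oracle.label(x_mid) != true_label:
            hi = mid
        else:
            lo = mid
    return (1.0 - hi) * x_orig + hi * x_start


# Toy two-class linear classifier (an assumption): class 1 iff x[0] > 1.
oracle = Top1Oracle(np.array([[0.0, 0.0], [1.0, 0.0]]), np.array([1.0, 0.0]))

x_orig = np.array([0.0, 0.0])
true_label = oracle.label(x_orig)

# A distant starting point that is already misclassified.
x_start = np.array([4.0, 3.0])
assert oracle.label(x_start) != true_label

x_boundary = boundary_search(oracle, x_orig, x_start, true_label)
distortion = float(np.linalg.norm(x_boundary - x_orig))  # L2 distortion
queries_used = oracle.queries

print(f"L2 distortion: {distortion:.4f} after {queries_used} queries")
```

In this toy case the search shrinks the L2 distortion from 5 (the distance to the starting point) to roughly 1.25 along that one direction. The true minimum is 1, reached along the boundary normal, which is why the choice of search directions matters when every query counts.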

Thibault Maho, Teddy Furon, Erwan Le Merrer • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Adversarial Attack | ILSVRC 2012 (val) | Median L2 Distance: 4.386 | 112 |
| Adversarial Attack | ILSVRC 2012 | Median L2 Distance: 5.05 | 96 |
| Adversarial Attack | ImageNet-21K (val) | Median L2 Distance: 0.949 | 80 |
| Adversarial Attack | Tiny ImageNet (val) | Median L2 Distance: 0.113 | 64 |
| Adversarial Attack | ImageNet 21k (test) | Median L2 Distance: 3.343 | 64 |
| Untargeted Attack | ImageNet (test) | Mean L2 Distortion (2K Budget): 34.28 | 42 |
| Targeted Attack | ImageNet (test) | Mean L2 Distortion (2K Budget): 61.31 | 38 |
| Targeted Adversarial Attack | ILSVRC 2012 | Median Noise Magnitude: 57.808 | 7 |
