
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

About

The evaluation of the robustness of neural network-based classifiers against adversarial manipulation is mostly carried out with empirical attacks, since methods for exact computation, even when available, do not scale to large networks. In this paper we propose a new white-box adversarial attack with respect to the $l_p$-norms for $p \in \{1,2,\infty\}$, aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, quickly yields high-quality results, and minimizes the size of the perturbation, so that a single run returns the robust accuracy at every threshold. It performs better than or comparably to state-of-the-art attacks, which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient masking.

Francesco Croce, Matthias Hein • 2019
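
Because the attack minimizes the perturbation for each input rather than testing a fixed budget, the robust accuracy at any threshold follows directly from the per-example minimal norms. A minimal sketch of that post-processing step, assuming a `min_norms` array produced by some minimum-distortion attack (the array values and the helper name are illustrative, not from the paper):

```python
import numpy as np

def robust_accuracy(min_norms, eps):
    """Fraction of test points whose minimal adversarial perturbation
    is larger than eps, i.e. points that no perturbation of l_p-norm
    <= eps can misclassify. `min_norms` holds one minimal-norm value
    per example; a common convention is 0 for points the model already
    misclassifies and np.inf for points the attack never flipped."""
    return float(np.mean(np.asarray(min_norms) > eps))

# Hypothetical per-example minimal l_p norms from a single attack run;
# each threshold is then a cheap lookup, with no re-running of the attack.
min_norms = np.array([0.08, 0.21, np.inf, 0.05, 0.40])
for eps in (0.1, 0.2, 0.3):
    print(f"eps={eps}: robust accuracy = {robust_accuracy(min_norms, eps):.2f}")
```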

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Adversarial Attack | MNIST (test) | Median ‖δ‖_p | 0.138 | 21 |
| Polyp Detection | Kvasir 1.0 (test) | Precision | 95.5 | 12 |
| Polyp Detection | In-house 1.0 (test) | Precision | 90.1 | 12 |
| Adversarial Attack | MNIST | Avg Latency (ms) | 8.88 | 6 |
| Adversarial Attack | CIFAR10 (test) | Median ‖δ‖_p | 4.79 | 5 |
| Adversarial Attack | CIFAR10 | Avg Query Time (ms) | 108.9 | 3 |
