Towards Evaluating the Robustness of Neural Networks

About

Neural networks provide state-of-the-art results for most machine learning tasks. Unfortunately, neural networks are vulnerable to adversarial examples: given an input $x$ and any target classification $t$, it is possible to find a new input $x'$ that is similar to $x$ but classified as $t$. This makes it difficult to apply neural networks in security-critical areas. Defensive distillation is a recently proposed approach that can take an arbitrary neural network, and increase its robustness, reducing the success rate of current attacks' ability to find adversarial examples from $95\%$ to $0.5\%$. In this paper, we demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with $100\%$ probability. Our attacks are tailored to three distance metrics used previously in the literature, and when compared to previous adversarial example generation algorithms, our attacks are often much more effective (and never worse). Furthermore, we propose using high-confidence adversarial examples in a simple transferability test we show can also be used to break defensive distillation. We hope our attacks will be used as a benchmark in future defense attempts to create neural networks that resist adversarial examples.
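The abstract refers to "three distance metrics" without naming them; in the paper these are the L0, L2, and L-infinity distances between the original input $x$ and the adversarial input $x'$. The sketch below only illustrates how each metric measures "similarity" on a toy flattened input. The function names and sample values are hypothetical, chosen for illustration, and are not taken from the authors' code.

```python
import math

def l0_distance(x, x_adv, tol=1e-12):
    """Number of coordinates (e.g. pixels) that differ between x and x_adv."""
    return sum(1 for a, b in zip(x, x_adv) if abs(a - b) > tol)

def l2_distance(x, x_adv):
    """Euclidean (root-sum-of-squares) size of the perturbation."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_adv)))

def linf_distance(x, x_adv):
    """Largest change made to any single coordinate."""
    return max(abs(a - b) for a, b in zip(x, x_adv))

# Hypothetical 4-pixel input and a perturbed copy (values assumed in [0, 1]).
x = [0.0, 0.5, 1.0, 0.25]
x_adv = [0.0, 0.5, 0.7, 0.65]

print("L0  :", l0_distance(x, x_adv))
print("L2  :", l2_distance(x, x_adv))
print("Linf:", linf_distance(x, x_adv))
```

An attack tailored to one metric generally trades off against the others: changing few pixels by a lot keeps L0 small but L-infinity large, while changing every pixel slightly does the opposite.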

Nicholas Carlini, David Wagner • 2016

Related benchmarks

Task	Dataset	Result
Adversarial Attack Transferability	CIFAR-100	Average ASR 84.18	138
Untargeted Adversarial Attack	CIFAR-10 (test)	ASR 100	95
Adversarial Attack	Mini-ImageNet	Attack Success Rate 81.74	64
Untargeted Adversarial Attack	ImageNet-1k (val)	ASR 99.27	57
Adversarial Attack	CIFAR-100	ASR (Average) 84.18	56
Adversarial Attack	CIFAR10	ASR 58.14	50
Few-Shot Class-Incremental Learning	MiniImagenet	Avg Accuracy 60.1	45
Adversarial Attack	MSTAR DARPA (test)	ASR 85.08	42
Adversarial Attack	CIFAR100	ASR 79.17	38
Adversarial Attack	CIFAR-10	ASR 70.26	32

Showing 10 of 39 rows
