Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs

About

The unprecedented success of deep neural networks in many applications has made these networks a prime target for adversarial exploitation. In this paper, we introduce a benchmark technique for detecting backdoor attacks (also known as Trojan attacks) on deep convolutional neural networks (CNNs). We introduce the concept of Universal Litmus Patterns (ULPs), which reveal backdoor attacks by feeding these universal patterns to the network and analyzing the output (i.e., classifying the network as 'clean' or 'corrupted'). This detection is fast because it requires only a few forward passes through a CNN. We demonstrate the effectiveness of ULPs for detecting backdoor attacks on thousands of networks with different architectures trained on four benchmark datasets, namely the German Traffic Sign Recognition Benchmark (GTSRB), MNIST, CIFAR10, and Tiny-ImageNet. The code and train/test models for this paper are available at https://umbcvision.github.io/Universal-Litmus-Patterns/.

Soheil Kolouri, Aniruddha Saha, Hamed Pirsiavash, Heiko Hoffmann • 2019
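
The detection step described in the abstract (feed a few fixed patterns to a network, then classify the network from its outputs) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the outputs for the M patterns are concatenated and scored by a small linear classifier, and the names `ulp_detect`, `model_fn`, `W`, and `b` are hypothetical. In practice the patterns and the classifier would have to be learned from a collection of clean and corrupted networks; here they are random placeholders.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ulp_detect(model_fn, patterns, W, b):
    """Classify a network as clean (0) or corrupted (1).

    model_fn : callable mapping a batch of inputs (M, ...) to logits (M, K)
    patterns : array holding M litmus patterns, shape (M, ...)
    W, b     : hypothetical linear detector over the concatenated
               outputs, with shapes (M*K, 2) and (2,)
    """
    logits = model_fn(patterns)      # forward passes, one per pattern
    features = logits.reshape(-1)    # concatenate the M output vectors
    probs = softmax(features @ W + b)
    return int(np.argmax(probs)), probs

# Toy usage: a random linear map stands in for the CNN under inspection.
rng = np.random.default_rng(0)
M, D, K = 5, 16, 10                  # patterns, input size, classes
patterns = rng.normal(size=(M, D))   # placeholder (untrained) patterns
net_weights = rng.normal(size=(D, K))

def toy_model(x):
    return x @ net_weights

W = rng.normal(size=(M * K, 2))      # placeholder (untrained) detector
b = np.zeros(2)

label, probs = ulp_detect(toy_model, patterns, W, b)
```

Because the network is only queried on M inputs, the cost per inspected model is M forward passes plus one small matrix product, which matches the abstract's claim that detection needs only a few forward passes.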

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Trojan Detection | CIFAR-10 | True Positives (TP) | 1 | 22 |
| Backdoor Detection | GTSRB WaNet Attack (test) | AUC | 0.9265 | 15 |
| Backdoor Detection | GTSRB BppAttack (test) | AUC | 0.9232 | 15 |
| Backdoor Detection | GTSRB SIG attack (test) | AUC | 83.07 | 15 |
| Trojaned Model Detection | MNIST Resnet18 (test) | Accuracy | 71 | 5 |
| Trojaned Model Detection | MNIST LeNet5 (test) | Accuracy | 58 | 5 |
| Trojaned Model Detection | CIFAR10 Resnet18 (test) | Accuracy | 56 | 5 |
| Trojaned Model Detection | CIFAR10 Densenet121 (test) | Accuracy | 55 | 5 |
| Trojan Detection | IARPA/NIST TrojAI DenseNet Round 1 | ACC | 63 | 4 |
| Trojan Detection | IARPA/NIST TrojAI ResNet Round 1 | Accuracy | 63 | 4 |
Showing 10 of 11 rows
