
WaNet -- Imperceptible Warping-based Backdoor Attack

About

With the thriving of deep learning and the widespread practice of using pre-trained networks, backdoor attacks have become an increasing security threat, drawing much research interest in recent years. A third-party model can be poisoned in training to work well in normal conditions but behave maliciously when a trigger pattern appears. However, the existing backdoor attacks are all built on noise perturbation triggers, making them noticeable to humans. In this paper, we instead propose using warping-based triggers. The proposed backdoor outperforms the previous methods in a human inspection test by a wide margin, proving its stealthiness. To make such models undetectable by machine defenders, we propose a novel training mode, called the "noise mode". The trained networks successfully attack and bypass the state-of-the-art defense methods on standard classification datasets, including MNIST, CIFAR-10, GTSRB, and CelebA. Behavior analyses show that our backdoors are transparent to network inspection, further proving this novel attack mechanism's efficiency.
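The core idea above is that the trigger is a small, smooth geometric warp of the input image rather than an additive noise patch. Below is a minimal, hedged sketch of that idea in plain NumPy: a fixed elastic warping field is built from a coarse grid of random offsets and applied to an image with bilinear sampling. The function names, the `strength`/`grid` parameters, and the interpolation details are illustrative assumptions for exposition, not the paper's exact implementation (which uses differentiable grid sampling).

```python
import numpy as np

def make_warp_field(size, strength=0.5, grid=4, seed=0):
    """Build a fixed, smooth warping field: random offsets on a coarse
    grid, upsampled to full resolution. Parameters are illustrative,
    not the paper's settings."""
    rng = np.random.default_rng(seed)
    coarse = rng.uniform(-1, 1, size=(grid, grid, 2))  # coarse control offsets
    ys = np.linspace(0, grid - 1, size)
    xs = np.linspace(0, grid - 1, size)
    field = np.empty((size, size, 2))
    for c in range(2):
        # interpolate along x for each coarse row, then along y per column
        tmp = np.array([np.interp(xs, np.arange(grid), coarse[i, :, c])
                        for i in range(grid)])            # (grid, size)
        field[..., c] = np.array([np.interp(ys, np.arange(grid), tmp[:, j])
                                  for j in range(size)]).T  # (size, size)
    return field * strength  # per-pixel (dy, dx) offsets

def warp_image(img, field):
    """Apply the warping field to an (H, W) grayscale image using
    bilinear sampling; sample coordinates are clamped to the image."""
    h, w = img.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sy = np.clip(yy + field[..., 0], 0, h - 1)
    sx = np.clip(xx + field[..., 1], 0, w - 1)
    y0, x0 = np.floor(sy).astype(int), np.floor(sx).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    fy, fx = sy - y0, sx - x0
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bot = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    return top * (1 - fy) + bot * fy
```

Because the offsets are sub-pixel and spatially smooth, the warped image differs from the original only slightly at each pixel, which is what makes this style of trigger hard for a human inspector to notice compared with additive noise patterns.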

Anh Nguyen, Anh Tran • 2021

Related benchmarks

Task                  Dataset                 Metric               Result   Rank
Backdoor Defense      CIFAR10 (test)          ASR                  0.54     322
Image Classification  ImageNet V2 (test)      --                   --       216
Image Classification  ImageNet-A (test)       --                   --       175
Image Classification  ImageNet-Sketch (test)  --                   --       153
Image Classification  GTSRB                   Natural Accuracy     96.2     87
Image Classification  GTSRB                   CA                   95.99    79
Image Classification  MNIST                   Clean Accuracy       97       71
Backdoor Attack       CIFAR10                 Attack Success Rate  12.53    70
Backdoor Attack       GTSRB                   Backdoor Accuracy    98.28    59
Backdoor Attack       CIFAR-10 (test)         Backdoor Accuracy    91.22    54

(Showing 10 of 39 rows)
