Design and Evaluation of a Multi-Domain Trojan Detection Method on Deep Neural Networks
About
This work corroborates a run-time Trojan detection method that exploits STRong Intentional Perturbation of inputs as a multi-domain Trojan detection defence across the Vision, Text and Audio domains, hence the name STRIP-ViTA. Specifically, STRIP-ViTA is the first confirmed Trojan detection method that is demonstrably independent of both the task domain and the model architecture. We have extensively evaluated the performance of STRIP-ViTA over: i) the CIFAR10 and GTSRB datasets using 2D CNNs, and a public third-party Trojaned model, for vision tasks; ii) the IMDB and consumer complaint datasets using both LSTM and 1D CNNs for text tasks; and iii) the speech command dataset using both 1D CNNs and 2D CNNs for audio tasks. Experimental results based on 28 tested Trojaned models demonstrate that STRIP-ViTA performs well across all nine architectures and five datasets. In general, STRIP-ViTA can effectively detect Trojan inputs with a small false acceptance rate (FAR) at an acceptable preset false rejection rate (FRR). In particular, for vision tasks, we always achieve 0% FRR and 0% FAR. With FRR set to 3%, average FARs of 1.1% and 3.55% are achieved for text and audio tasks, respectively. Moreover, we have evaluated and shown the effectiveness of STRIP-ViTA against a number of advanced backdoor attacks, whereas other state-of-the-art methods lose effectiveness against one or all of these attacks.
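The relationship between a preset FRR and the resulting FAR can be illustrated with a short sketch. The abstract does not spell out the scoring rule, so the code below assumes the entropy-based score of the original STRIP idea: an input is superimposed with random clean samples, and a Trojaned input tends to keep a confident (low-entropy) prediction under such perturbation. The function names, the equal-weight blending, and the `predict` callable are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def strip_score(predict, x, clean_samples, n_perturb=20, rng=None):
    """Mean prediction entropy of `x` blended with random clean samples.

    `predict` is a hypothetical callable mapping a feature list to class
    probabilities. Equal-weight blending is an assumption of this sketch.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_perturb):
        other = rng.choice(clean_samples)
        blended = [0.5 * a + 0.5 * b for a, b in zip(x, other)]
        total += shannon_entropy(predict(blended))
    return total / n_perturb


def threshold_for_frr(clean_scores, frr):
    """Detection boundary such that at most `frr` of clean scores fall below it.

    Inputs scoring below the boundary are rejected as Trojaned, so clean
    inputs below it are false rejections.
    """
    ordered = sorted(clean_scores)
    k = int(frr * len(ordered))
    return ordered[k]


def far(trojan_scores, threshold):
    """Fraction of Trojaned inputs accepted as clean (score >= threshold)."""
    accepted = sum(1 for s in trojan_scores if s >= threshold)
    return accepted / len(trojan_scores)
```

In use, one would compute `strip_score` on held-out clean inputs, fix the boundary with `threshold_for_frr(clean_scores, 0.03)` for a 3% FRR, and then measure `far` on known Trojaned inputs. Lowering the preset FRR moves the boundary down and generally raises the FAR, which is the trade-off the reported figures describe.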
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Sentiment Classification | SST2 (test) | Accuracy | 89.45 | 214 |
| Sentiment Classification | IMDB (test) | -- | -- | 144 |
| Poisoned sample detection | TrojAI round 6 (test) | Precision | 0.917 | 96 |
| Sentiment Analysis | SST-2 (test) | Clean Accuracy | 95.99 | 50 |
| Text Generation | Medical Chatbot | ASR | 88.92 | 42 |
| Backdoor Defense | SST-2 | CACC | 91.39 | 41 |
| Text Classification | CR | CA | 91.45 | 31 |
| Text Classification | HSOL | CACC | 95.53 | 26 |
| Topic Classification | AG's News | CACC | 91.37 | 24 |
| Text Classification | SST-2 | CACC | 91.49 | 20 |