QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Networks against Adversarial Attacks
About
Adversarial examples have emerged as a significant threat to machine learning algorithms, especially to convolutional neural networks (CNNs). In this paper, we propose two quantization-based defense mechanisms, Constant Quantization (CQ) and Trainable Quantization (TQ), to increase the robustness of CNNs against adversarial examples. CQ quantizes input pixel intensities based on a *fixed* number of quantization levels, while in TQ the quantization levels are *iteratively learned during the training phase*, thereby providing a stronger defense mechanism. We apply the proposed techniques to undefended CNNs against different state-of-the-art adversarial attacks from the open-source *Cleverhans* library. The experimental results demonstrate a 50%–96% and a 10%–50% increase in the classification accuracy of perturbed images generated from the MNIST and CIFAR-10 datasets, respectively, on a commonly used CNN (Conv2D(64, 8x8) - Conv2D(128, 6x6) - Conv2D(128, 5x5) - Dense(10) - Softmax()) available in the *Cleverhans* library.
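The Constant Quantization (CQ) step described above can be sketched as follows. This is a minimal illustration of mapping pixel intensities to a fixed number of evenly spaced levels, not the authors' exact implementation; the function name and interface are assumptions:

```python
import numpy as np

def constant_quantize(x, levels=4):
    """CQ sketch: snap intensities in [0, 1] to `levels` evenly spaced values.

    With levels=4 the allowed intensities are 0, 1/3, 2/3, and 1; each pixel
    is rounded to the nearest of these, discarding small adversarial
    perturbations that fall within a quantization bin.
    """
    step = 1.0 / (levels - 1)          # spacing between adjacent levels
    return np.round(x / step) * step   # round to the nearest level

# Example: a handful of pixel intensities before and after quantization.
pixels = np.array([0.05, 0.30, 0.62, 0.97])
quantized = constant_quantize(pixels, levels=4)
# -> array([0., 0.33333333, 0.66666667, 1.])
```

TQ replaces the fixed level boundaries with parameters learned jointly with the network weights during training, which is what makes it the stronger of the two defenses.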
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | OpenBookQA | Accuracy | 43.6 | 126 |
| Backdoor Defense | Code Injection (test) | ASR | 30.1 | 22 |
| Sentiment Steering | Sentiment Steering LLaMA2-13B-Chat (test) | BadNets | 77.69 | 11 |
| Sentiment Steering | Sentiment Steering Mistral-7B-Instruct 0.1 (test) | ASR (BadNets) | 89.06 | 11 |
| Targeted Refusal | Targeted Refusal LLaMA2-7B-Chat (test) | BadNets | 68.32 | 11 |
| Sentiment Steering | Sentiment Steering LLaMA2-7B-Chat (test) | BadNets | 31.5 | 11 |
| Targeted Refusal | Targeted Refusal LLaMA2-13B-Chat (test) | BadNets | 93.21 | 11 |