A Self-supervised Approach for Adversarial Robustness

About

Adversarial examples can cause catastrophic mistakes in vision systems based on Deep Neural Networks (DNNs), e.g., for classification, segmentation and object detection. The vulnerability of DNNs to such attacks can prove a major roadblock to their real-world deployment. The transferability of adversarial examples demands generalizable defenses that can provide cross-task protection. Adversarial training, which enhances robustness by modifying the target model's parameters, lacks such generalizability. On the other hand, different input-processing-based defenses fall short in the face of continuously evolving attacks. In this paper, we take the first step to combine the benefits of both approaches and propose a self-supervised adversarial training mechanism in the input space. By design, our defense is a generalizable approach and provides significant robustness against unseen adversarial attacks (e.g., by reducing the success rate of the translation-invariant ensemble attack from 82.6% to 31.9% in comparison to the previous state of the art). It can be deployed as a plug-and-play solution to protect a variety of vision systems, as we demonstrate for classification, segmentation and detection. Code is available at: https://github.com/Muzammal-Naseer/NRP
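
The "plug-and-play" idea in the abstract (defend in the input space and leave the target model's parameters untouched) can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: `toy_purifier` is plain pixel binarization, not the paper's self-supervised purifier network, and `toy_classifier` is a made-up threshold model. The sketch only shows how a purifier might be composed in front of an arbitrary model.

```python
from typing import Callable, List, Sequence

Image = List[float]  # flattened image, pixel values in [0, 1]


def toy_purifier(image: Sequence[float]) -> Image:
    """Hypothetical purifier: binarize each pixel at 0.5.

    This is simple quantization used for illustration only; it is NOT the
    learned purifier described in the paper.
    """
    return [1.0 if p >= 0.5 else 0.0 for p in image]


def toy_classifier(image: Sequence[float]) -> int:
    """Hypothetical task model: class 1 if mean intensity exceeds 0.52."""
    return int(sum(image) / len(image) > 0.52)


def defend(
    model: Callable[[Sequence[float]], int],
    purifier: Callable[[Sequence[float]], Image],
) -> Callable[[Sequence[float]], int]:
    """Wrap a model so every input is purified first; the model is unchanged."""

    def defended(image: Sequence[float]) -> int:
        return model(purifier(image))

    return defended


clean = [0.0, 1.0, 0.0, 1.0]        # mean 0.50, classified as 0
adversarial = [0.1, 1.0, 0.1, 1.0]  # small perturbation, mean 0.55, classified as 1

defended = defend(toy_classifier, toy_purifier)
print(toy_classifier(clean), toy_classifier(adversarial), defended(adversarial))
```

Because `defend` only composes functions, the same wrapper could in principle sit in front of a classifier, a segmentation model, or a detector, which is the sense in which an input-space defense is task-agnostic.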

Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Fatih Porikli • 2020

Related benchmarks

Task	Dataset	Result
Image Classification	SVHN (test)	--	199
Visual Reasoning	NLVR2	--	58
Visual Entailment	SNLI-VE	Accuracy 0.1511	31
Image Captioning	MSCOCO (test)	CIDEr 23.54	29
Adversarial Attack	ImageNet	BERTScore F1 (c=0.002) 0.949	21
REC	RefCOCO+	ASR 69.6	16
REC	RefCOCOg	ASR 69.26	16
Adversarial Attack	Cityscapes (test)	ASR 5.9	12
Adversarial Attack	SA-1B (test)	ASR 4.08	12
Adversarial Attack	ADE20K (test)	ASR 0.67	11

Showing 10 of 18 rows
