
ColorFool: Semantic Adversarial Colorization

About

Adversarial attacks that generate small L_p-norm perturbations to mislead classifiers have limited success in black-box settings and against unseen classifiers. These attacks are also not robust to defenses that use denoising filters or to adversarial training procedures. In contrast, adversarial attacks that generate unrestricted perturbations are more robust to defenses, are generally more successful in black-box settings, and transfer better to unseen classifiers. However, unrestricted perturbations may be noticeable to humans. In this paper, we propose a content-based black-box adversarial attack that generates unrestricted perturbations by exploiting image semantics to selectively modify colors within ranges that humans perceive as natural. We show that the proposed approach, ColorFool, outperforms five state-of-the-art adversarial attacks in terms of success rate, robustness to defense frameworks, and transferability on two different tasks, scene and object classification, when attacking three state-of-the-art deep neural networks on three standard datasets. The source code is available at https://github.com/smartcameras/ColorFool.
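The core idea described above can be sketched as a simple trial loop: sample increasingly strong color shifts, restrict the shifts in semantically sensitive regions (e.g. skin, sky, vegetation) to small, natural-looking ranges, and stop as soon as the classifier is fooled. The sketch below is an illustrative simplification, not the authors' implementation: `colorfool_sketch`, the 0.1 sensitivity factor, and the dummy classifier in the usage example are all assumptions for demonstration, and the image is represented directly by its a/b color channels in CIELab space.

```python
import numpy as np

def colorfool_sketch(image_ab, sensitive_mask, classifier, true_label,
                     max_trials=100, seed=0):
    """Hedged sketch of a ColorFool-style color attack (not the authors' code).

    image_ab:       (H, W, 2) array, a/b channels in CIELab, values in [-128, 127].
    sensitive_mask: (H, W) bool array, True where colors must stay near-natural.
    classifier:     callable mapping an (H, W, 2) array to a predicted label.
    true_label:     label of the clean image; the attack succeeds when the
                    classifier's prediction differs from it.
    """
    rng = np.random.default_rng(seed)
    for trial in range(1, max_trials + 1):
        # Perturbation intensity grows with the trial index, so early trials
        # try subtle color changes and later ones more aggressive shifts.
        scale = trial / max_trials
        shift = rng.uniform(-127, 127, size=image_ab.shape) * scale
        # Semantically sensitive regions are restricted to a small shift;
        # the 0.1 factor is an illustrative stand-in for natural color ranges.
        shift[sensitive_mask] *= 0.1
        adv = np.clip(image_ab + shift, -128, 127)
        if classifier(adv) != true_label:
            return adv, trial  # misclassified: attack succeeded
    return None, max_trials    # no adversarial color change found
```

A minimal usage example with a toy classifier that flips its prediction once the average color magnitude is large enough:

```python
image = np.zeros((4, 4, 2))                 # neutral a/b channels
mask = np.zeros((4, 4), dtype=bool)         # no sensitive regions
toy_clf = lambda x: 1 if np.abs(x).mean() > 5 else 0
adv, trials = colorfool_sketch(image, mask, toy_clf, true_label=0)
```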

Ali Shahin Shamsabadi, Ricardo Sanchez-Matilla, Andrea Cavallaro• 2019

Related benchmarks

Task                         | Dataset                                                      | Metric                 | Result | Rank
Adversarial Attack           | ImageNet (val)                                               | -                      | -      | 222
Adversarial Attack           | ImageNet (test)                                              | Success Rate           | 90.4   | 101
Adversarial Attack           | ImageNet-compatible Stable Diffusion context v1.4 (test)     | ASR (MN-v2)            | 93.3   | 38
Targeted Transfer Attack     | ImageNet (val)                                               | Attack Success Rate    | 99.2   | 25
Image Quality Assessment     | ImageNet (test)                                              | NIMA Score (AVA)       | 5.24   | 11
Adversarial Attack           | ImageNet-Compatible                                          | HGD Score              | 9.1    | 11
Black-box Adversarial Attack | ImageNet                                                     | Top-1 Accuracy (JPEG)  | 42.1   | 7
Image Quality Assessment     | ImageNet                                                     | NIMA Technical Score   | 4.918  | 7
