Invisible Backdoor Attack against Self-supervised Learning

About

Self-supervised learning (SSL) models are vulnerable to backdoor attacks. Existing backdoor attacks that are effective in SSL often involve noticeable triggers, like colored patches or visible noise, which are vulnerable to human inspection. This paper proposes an imperceptible and effective backdoor attack against self-supervised models. We first find that existing imperceptible triggers designed for supervised learning are less effective in compromising self-supervised models. We then identify this ineffectiveness is attributed to the overlap in distributions between the backdoor and augmented samples used in SSL. Building on this insight, we design an attack using optimized triggers disentangled with the augmented transformation in the SSL, while remaining imperceptible to human vision. Experiments on five datasets and six SSL algorithms demonstrate our attack is highly effective and stealthy. It also has strong resistance to existing backdoor defenses. Our code can be found at https://github.com/Zhang-Henry/INACTIVE.

Hanrong Zhang, Zhenting Wang, Boheng Li, Fulin Lin, Tingxu Han, Mingyu Jin, Chenlu Zhan, Mengnan Du, Hongwei Wang, Shiqing Ma• 2024

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet V2 (test)	--	232
Image Classification	ImageNet-A (test)	--	177
Backdoor Attack	CIFAR10	Attack Success Rate99.58	158
Image Classification	ImageNet-Sketch (test)	--	153
Backdoor Attack	GTSRB	Attack Success Rate98.73	142
Image Classification	GTSRB	--	87
Backdoor Attack	CIFAR-10 (test)	Backdoor Accuracy93.01	54
Image-Text Retrieval	COCO (test)	Recall@139.46	41
Backdoor Attack Stealthiness Evaluation	CIFAR10	SSIM0.9633	40
Backdoor Attack	SVHN	Attack Success Rate99.76	27

Showing 10 of 28 rows

Other info

Code

Follow for update

@wizwand_team Discord