Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization

About

Backdoor defense, which aims to detect or mitigate the effect of malicious triggers introduced by attackers, is becoming increasingly critical for machine learning security and integrity. Fine-tuning based on benign data is a natural defense to erase the backdoor effect in a backdoored model. However, recent studies show that, given limited benign data, vanilla fine-tuning has poor defense performance. In this work, we provide a deep study of fine-tuning the backdoored model from the neuron perspective and find that backdoor-related neurons fail to escape the local minimum in the fine-tuning process. Inspired by the observation that backdoor-related neurons often have larger norms, we propose FTSAM, a novel backdoor defense paradigm that aims to shrink the norms of backdoor-related neurons by incorporating sharpness-aware minimization with fine-tuning. We demonstrate the effectiveness of our method on several benchmark datasets and network architectures, where it achieves state-of-the-art defense performance. Overall, our work provides a promising avenue for improving the robustness of machine learning models against backdoor attacks.

Mingli Zhu, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu • 2023
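
The abstract names sharpness-aware minimization (SAM) as the ingredient added to fine-tuning. As a rough illustration of what a single SAM update looks like, here is a minimal sketch of generic SAM on a toy quadratic loss. It is not the authors' FTSAM procedure, whose details the abstract does not give. The names `sam_step`, `toy_grad`, `toy_loss`, and the values of `lr` and `rho` are hypothetical choices made for this example.

```python
import math
from typing import Callable, List

Vector = List[float]


def sam_step(w: Vector, grad_fn: Callable[[Vector], Vector],
             lr: float = 0.05, rho: float = 0.05) -> Vector:
    """One generic SAM update (illustrative, not the paper's exact method).

    1. Compute the gradient g at the current weights w.
    2. Move to the perturbed point w + rho * g / ||g||, the first-order
       worst case inside an L2 ball of radius rho.
    3. Compute the gradient at the perturbed point and apply it to w.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction, so leave the weights unchanged.
        return list(w)
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def toy_loss(w: Vector) -> float:
    # Hypothetical stand-in for a fine-tuning loss: 0.5 * (10*w0^2 + w1^2).
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)


def toy_grad(w: Vector) -> Vector:
    # Analytic gradient of toy_loss.
    return [10.0 * w[0], w[1]]


w_init = [1.0, 1.0]
w = list(w_init)
for _ in range(50):
    w = sam_step(w, toy_grad)

print("initial loss:", toy_loss(w_init))
print("final loss:  ", toy_loss(w))
```

In a real fine-tuning setting, `grad_fn` would be replaced by a backward pass over a batch of benign data, and each update would need two forward-backward passes, one at the current weights and one at the perturbed weights.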

Related benchmarks

Task	Dataset	Result
Backdoor Defense	CIFAR10 (test)	--	327
Backdoor Detection	CIFAR-10	--	135
Backdoor Defense	CIFAR-10 (test)	--	70
Backdoor Detection	GTSRB	--	48
Backdoor Defense	CIFAR-10 Blended v1 (test)	Clean Accuracy: 92.28	34
Backdoor Defense	CIFAR-10 BadNet v1 (test)	Clean Accuracy: 91.53	20
Backdoor Defense	CIFAR-10	BadNet C-Acc: 92.63	17
Backdoor Defense	CIFAR-10 LC DenseNet-161 (test)	Clean Accuracy: 87.79	17
Backdoor Defense	CIFAR-10 SSBA DenseNet-161 (test)	Clean Accuracy: 86.98	17
Backdoor Defense	CIFAR-10 SSBA v1 (test)	Clean Accuracy: 91.77	17

Showing 10 of 18 rows
