Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

About

Mode connectivity provides novel geometric insights for analyzing loss landscapes and enables building high-accuracy pathways between well-trained neural networks. In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness. Our experiments cover various types of adversarial attacks applied to different network architectures and datasets. When network models are tampered with by backdoor or error-injection attacks, our results demonstrate that the path connection learned using a limited amount of bona fide data can effectively mitigate adversarial effects while maintaining the original accuracy on clean data. Therefore, mode connectivity provides users with the power to repair backdoored or error-injected models. We also use mode connectivity to investigate the loss landscapes of regular and robust models against evasion attacks. Experiments show that there exists a barrier in adversarial robustness loss on the path connecting regular and adversarially-trained models. A high correlation is observed between the adversarial robustness loss and the largest eigenvalue of the input Hessian matrix, for which theoretical justifications are provided. Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
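As a rough illustration of what a "path connection" between two models can look like, the sketch below evaluates a loss along a quadratic Bezier curve between two weight vectors. The curve form is the one commonly used in the mode connectivity literature and is assumed here, not confirmed by the abstract. The toy loss, the endpoints, and the hand-picked bend point `theta` are all hypothetical stand-ins; in practice the bend would be learned from data, not chosen by hand.

```python
def bezier_point(w1, w2, theta, t):
    """Point on the quadratic Bezier curve
    phi(t) = (1-t)^2 * w1 + 2t(1-t) * theta + t^2 * w2, for t in [0, 1]."""
    a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
    return [a * x + b * m + c * y for x, m, y in zip(w1, theta, w2)]


def toy_loss(w):
    """Hypothetical 2-D loss whose minima form the unit circle.
    It only stands in for a network's loss over its weights."""
    x, y = w
    return (x * x + y * y - 1.0) ** 2


def max_loss_on_path(w1, w2, theta, steps=20):
    """Largest loss seen on an evenly spaced grid of t values along the path."""
    return max(
        toy_loss(bezier_point(w1, w2, theta, i / steps))
        for i in range(steps + 1)
    )


# Two "trained models": both sit at minima of the toy loss.
w1, w2 = [1.0, 0.0], [-1.0, 0.0]

# With the bend at the midpoint, the curve is the straight segment,
# which crosses the high-loss region around the origin.
straight_barrier = max_loss_on_path(w1, w2, theta=[0.0, 0.0])

# A hand-picked bend (assumption) routes the path around the barrier.
curved_barrier = max_loss_on_path(w1, w2, theta=[0.0, 2.0])

print(straight_barrier, curved_barrier)
```

On this toy problem the straight segment passes through a loss barrier while the bent path stays close to the minima, which is the geometric picture the abstract relies on.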

Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin • 2020

Related benchmarks

Task	Dataset	Result
Backdoor Defense	CIFAR10 (test)	ASR 4.52	333
Backdoor Defense	GTSRB (test)	ASR 0.12	138
Backdoor Defense	CIFAR10 (train)	ASR 1.14	63
Backdoor Defense	CIFAR-10	Total Time (s) 3.45e+3	14
Backdoor Defense	CIFAR-10 0.1% clean data (50 images) (train)	ACC (Badnets) 80.21	12
Backdoor Defense	CIFAR-10 10% clean data, Badnets attack (train)	Accuracy 89.17	6
Backdoor Defense	CIFAR-10 10% clean data, Blend attack (train)	Accuracy 89.45	6
Backdoor Defense	CIFAR-10 10% clean data, IAB-one attack (train)	Accuracy 87.01	6
Backdoor Defense	CIFAR-10 10% clean data, IAB-all attack (train)	Accuracy 88.53	6
Backdoor Defense	CIFAR-10 10% clean data, CLB attack (train)	Accuracy 89.78	6

Showing 10 of 13 rows
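
For readers unfamiliar with the ASR metric in the table, the sketch below shows one common way an attack success rate is computed for backdoor defenses: the share of trigger-stamped inputs, excluding those whose true class already is the attacker's target, that the model assigns to the target class. This definition, the function name, and the percentage scaling are assumptions for illustration; the listing itself does not state how each benchmark computes ASR.

```python
def attack_success_rate(preds_on_triggered, true_labels, target_label):
    """Percentage of triggered inputs classified as the attacker's target.

    Samples whose true label already equals the target are skipped, since
    predicting the target for them does not indicate a successful attack.
    (Assumed convention; individual benchmarks may differ.)
    """
    eligible = [
        pred
        for pred, label in zip(preds_on_triggered, true_labels)
        if label != target_label
    ]
    if not eligible:
        raise ValueError("no samples with a true label other than the target")
    hits = sum(1 for pred in eligible if pred == target_label)
    return 100.0 * hits / len(eligible)


# Hypothetical predictions on five trigger-stamped images, target class 0.
asr = attack_success_rate(
    preds_on_triggered=[0, 0, 3, 0, 5],
    true_labels=[1, 2, 3, 0, 4],
    target_label=0,
)
print(asr)
```

Under this convention a lower ASR after repair means the backdoor is triggered less often, while clean accuracy is tracked separately.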
