Adversarial Robustness with Non-uniform Perturbations

About

Robustness of machine learning models is critical for security-related applications, where real-world adversaries are uniquely focused on evading neural-network-based detectors. Prior work mainly focuses on crafting adversarial examples (AEs) with small, uniform, norm-bounded perturbations across features to maintain the requirement of imperceptibility. However, uniform perturbations do not result in realistic AEs in domains such as malware, finance, and social networks. In these applications, features typically have semantically meaningful dependencies. The key idea of our proposed approach is to enable non-uniform perturbations that can adequately represent these feature dependencies during adversarial training. We propose using characteristics of the empirical data distribution, namely the correlations between features and the importance of the features themselves. Using experimental datasets for malware classification, credit risk prediction, and spam detection, we show that our approach is more robust to real-world attacks. Finally, we present robustness certification utilizing non-uniform perturbation bounds, and show that non-uniform bounds achieve better certification.
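As a rough illustration of what a correlation-aware, non-uniform perturbation bound could look like, the sketch below constrains a perturbation by a Mahalanobis-style norm built from the empirical covariance of the data, instead of a plain L2 norm. The choice of a Mahalanobis-style transform, the function names `mahalanobis_transform` and `rescale_into_ball`, and the `ridge` parameter are assumptions made for this example; the abstract does not specify the exact construction.

```python
import numpy as np

def mahalanobis_transform(X, ridge=1e-6):
    """Return a matrix omega with ||omega @ d||_2^2 == d^T Sigma^{-1} d.

    Sigma is the empirical covariance of X (rows are samples). A small
    ridge term (an assumption of this sketch) keeps Sigma invertible.
    """
    sigma = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
    chol = np.linalg.cholesky(sigma)      # sigma = chol @ chol.T
    return np.linalg.inv(chol), sigma

def rescale_into_ball(delta, omega, eps):
    """Shrink delta radially so that ||omega @ delta||_2 <= eps.

    This is a simple radial rescaling, not the exact Euclidean
    projection onto the ellipsoid. With omega = identity it reduces
    to the usual uniform L2-ball constraint.
    """
    norm = np.linalg.norm(omega @ delta)
    if norm <= eps:
        return delta
    return delta * (eps / norm)

# Toy data with two strongly correlated features and one independent one.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
X = np.hstack([base, base + 0.1 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 1))])

omega, sigma = mahalanobis_transform(X)
delta = rng.normal(size=3)
eps = 0.5

uniform = rescale_into_ball(delta, np.eye(3), eps)    # uniform L2 bound
nonuniform = rescale_into_ball(delta, omega, eps)     # correlation-aware bound
```

Under the non-uniform constraint, a perturbation that moves the two correlated features in opposite directions is penalized more heavily than one that moves them together, which is one way to encode the feature dependencies the abstract refers to.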

Ecenaz Erdemir, Jeffrey Bickford, Luca Melis, Sergul Aydore • 2021

Related benchmarks

Task	Dataset	Result
Malware Detection	EMBER	Clean Accuracy 96.3	49
Credit Risk Prediction	German Credit (test)	Clean Accuracy 69.7	31
Spam Detection	Twitter Spam dataset (test)	Clean Accuracy 94	26
Spam Detection	Twitter Spam Detection (test)	Certification S.R. 90.11	10
Spam Detection	Twitter Spam Detection 1000 spammers (test)	Margin 2.41	10
PDF Malware Classification	PDFrate-R	Clean Accuracy 97.83	3

