Explainability-based Backdoor Attacks Against Graph Neural Networks

About

Backdoor attacks represent a serious threat to neural network models. A backdoored model will misclassify trigger-embedded inputs into an attacker-chosen target label while performing normally on benign inputs. There are already numerous works on backdoor attacks against neural networks, but only a few consider graph neural networks (GNNs). In particular, the impact of the trigger injecting position on the performance of backdoor attacks on GNNs has not been studied in depth. To bridge this gap, we conduct an experimental investigation of backdoor attacks on GNNs. We apply two powerful GNN explainability approaches to select the optimal trigger injecting position to achieve two attacker objectives: a high attack success rate and a low clean accuracy drop. Our empirical results on benchmark datasets and state-of-the-art neural network models demonstrate the proposed method's effectiveness in selecting the trigger injecting position for backdoor attacks on GNNs. For instance, on the node classification task, the backdoor attack with the trigger injecting position selected by GraphLIME reaches an attack success rate of over 84% with less than a 2.5% accuracy drop.

Jing Xu, Minhui (Jason) Xue, Stjepan Picek · 2021
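The core idea is to score input positions with an explainer and place the trigger where the score says it will best serve the attacker's objectives. The sketch below illustrates only that selection-and-injection step for node classification; it is not the authors' code. A plain gradient-magnitude saliency stands in for GraphLIME's per-feature importance, and the toy MLP, data shapes, trigger size k, and constant trigger value are all assumptions made for illustration.

```python
# Minimal sketch of explainability-guided trigger placement for a node
# classification backdoor. Assumptions (not from the paper): a toy MLP over
# node features, gradient saliency standing in for GraphLIME's importance
# score, trigger size k=4, and a constant trigger value.
import torch
import torch.nn as nn

torch.manual_seed(0)
num_nodes, num_feats, num_classes = 100, 32, 4
x = torch.rand(num_nodes, num_feats)  # node feature matrix

model = nn.Sequential(
    nn.Linear(num_feats, 64), nn.ReLU(), nn.Linear(64, num_classes)
)

def feature_importance(model, x):
    """Mean absolute input gradient per feature dimension: a crude
    stand-in for a GraphLIME-style explanation score."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.abs().mean(dim=0)  # shape: [num_feats]

def inject_trigger(x, positions, value=1.0):
    """Overwrite the chosen feature dimensions with a fixed trigger pattern."""
    x = x.clone()
    x[:, positions] = value
    return x

scores = feature_importance(model, x)
k = 4  # trigger size (assumed)
# One plausible policy: place the trigger on the least-important dimensions so
# clean predictions are barely disturbed (low clean accuracy drop); backdoor
# training (omitted here) then binds the pattern to an attacker-chosen label
# to obtain a high attack success rate.
positions = torch.topk(scores, k, largest=False).indices
x_poisoned = inject_trigger(x, positions)
print("trigger dimensions:", positions.tolist())
```

Swapping the saliency function for an actual explainer (e.g., GraphLIME or GNNExplainer) only changes how `scores` is computed; the selection and injection steps stay the same.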

Related benchmarks

| Task                  | Dataset | Metric         | Result | Rank |
|-----------------------|---------|----------------|--------|------|
| Node Classification   | Flickr  | Clean Accuracy | 47.65  | 52   |
| Node Classification   | arXiv   | Clean Accuracy | 66.23  | 52   |
| Node Classification   | Cora    | ASR            | 19.38  | 28   |
| Node Classification   | Pubmed  | ASR            | 13.24  | 28   |
| Graph Backdoor Attack | arXiv   | ASR            | 21.07  | 28   |
| Graph Backdoor Attack | Cora    | ASR            | 16.71  | 28   |
| Graph Backdoor Attack | Pubmed  | ASR            | 19.25  | 28   |
| Graph Backdoor Attack | Flickr  | ASR            | 16.09  | 28   |
| Node Classification   | Pubmed  | Clean Accuracy | 85.6   | 24   |
| Node Classification   | Cora    | Accuracy       | 74.46  | 24   |
