Why Self-Inconsistency Arises in GNN Explanations and How to Exploit It

About

Recent work has observed that explanations produced by Self-Interpretable Graph Neural Networks (SI-GNNs) can be self-inconsistent: when the model is reapplied to its own explanatory graph subset, it may produce a different explanation. However, why self-inconsistency arises remains poorly understood. In this work, we first identify re-explanation-induced context perturbation as the direct cause of score variation. We then introduce a latent signal assignment hypothesis to explain why only some edges are sensitive to this perturbation, and analyze how conciseness regularization affects latent signal assignment. Given that self-inconsistent edges do not provide stable evidence for the model's prediction, we propose Self-Denoising (SD), a model-agnostic and training-free post-processing strategy that calibrates explanations with only one additional forward pass. Experiments across representative SI-GNN frameworks, backbone architectures, and benchmark datasets support our hypothesis and show that SD consistently improves explanation quality while adding only about 4-6% computational overhead in practice.
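
The abstract does not spell out the calibration rule, so the following is only a minimal sketch of the general idea of re-explaining the explanatory subgraph once and damping edges whose scores do not hold up. Everything in it is an assumption made for illustration: the toy scorer `toy_edge_scores` (a stand-in for an SI-GNN explainer), the helper `self_denoise`, the 0.5 threshold, and the choice of taking the minimum of the two scores. It is not the authors' implementation.

```python
from collections import Counter


def toy_edge_scores(edges):
    """Hypothetical stand-in for an SI-GNN explainer.

    An edge's score depends on the degrees of its endpoints within the
    graph it is given, so removing edges changes the scores of the rest.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {(u, v): (degree[u] + degree[v]) / 8 for u, v in edges}


def self_denoise(score_fn, edges, threshold=0.5):
    """Assumed calibration: re-score the explanatory subgraph once and keep,
    for each edge, the smaller of its original and re-explained scores."""
    scores = score_fn(edges)                             # original explanation
    kept = [e for e in edges if scores[e] >= threshold]  # explanatory subgraph
    rescored = score_fn(kept)                            # one additional pass
    # Edges outside the subgraph are not re-scored and keep their original score.
    return {e: min(scores[e], rescored.get(e, scores[e])) for e in edges}


# Toy graph: a triangle (0, 1, 2) with a tail 2-3-4.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
original = toy_edge_scores(edges)
calibrated = self_denoise(toy_edge_scores, edges)

for e in edges:
    print(e, original[e], calibrated[e])
```

In this toy graph, edge (2, 3) loses a neighboring edge when the subgraph is re-explained, so its score shifts and the calibrated value is lowered, while the triangle edges keep their scores. That mirrors, in a simplified form, the context perturbation the abstract describes.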

Wenxin Tai, Yaqian Liu, Ting Zhong, Fan Zhou • 2026

Related benchmarks

Task	Dataset	Result
Interpretable Graph Classification	3MR	AUC 99.92	24
Interpretable Graph Classification	BENZENE	AUC 0.9381	24
Interpretable Graph Classification	Mutagenicity	AUC 99.22	24
Graph Interpretation	BA-2MOTIFS	AUC 99.83	12
Interpretable Graph Classification	BA-2MOTIFS	AUC 99.88	12
