Diffusion-Jump GNNs: Homophiliation via Learnable Metric Filters

About

High-order Graph Neural Networks (HO-GNNs) have been developed to infer consistent latent spaces in the heterophilic regime, where the label distribution is not correlated with the graph structure. However, most existing HO-GNNs are hop-based, i.e., they rely on powers of the transition matrix. As a result, these architectures are not fully reactive to the classification loss, and the resulting structural filters have static supports. In other words, neither the filters' supports nor their coefficients can be learned with these networks; they are confined, instead, to learning combinations of filters. To address these concerns, we propose Diffusion-jump GNNs, a method relying on asymptotic diffusion distances that operates on jumps. A diffusion-pump generates pairwise distances whose projections determine both the support and the coefficients of each structural filter. These filters are called jumps because they explore a wide range of scales in order to find bonds between scattered nodes with the same label. The full process is controlled by the classification loss: both the jumps and the diffusion distances react to classification errors (i.e., they are learnable). Homophiliation, i.e., the process of learning piecewise smooth latent spaces in the heterophilic regime, is formulated as a Dirichlet problem: the known labels determine the border nodes, and the diffusion-pump ensures a minimal deviation of the semi-supervised grouping from a canonical unsupervised grouping. This triggers the update of both the diffusion distances and, consequently, the jumps in order to minimize the classification error. The Dirichlet formulation has several advantages. It leads to the definition of structural heterophily, a novel measure beyond edge heterophily. It also allows us to investigate links with (learnable) diffusion distances, absorbing random walks and stochastic diffusion.
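To make the Dirichlet framing concrete, here is a minimal sketch of the classical graph Dirichlet problem, in which labeled nodes act as fixed border nodes and each unlabeled node relaxes to the average of its neighbors. This is a textbook harmonic extension and not the authors' diffusion-pump or jump filters. The function name `solve_dirichlet`, the toy path graph, and the iteration count are all assumptions made for illustration.

```python
def solve_dirichlet(num_nodes, edges, boundary, iters=2000):
    """Harmonic extension on an undirected graph (illustrative sketch only).

    boundary maps a node index to its fixed value (the known label).
    Interior nodes are repeatedly replaced by the mean of their neighbors
    (Gauss-Seidel relaxation), which converges to the Dirichlet solution
    on a connected graph with at least one border node.
    """
    neighbors = [[] for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Border nodes start at their fixed value; interior nodes at a neutral 0.5.
    f = [boundary.get(i, 0.5) for i in range(num_nodes)]

    for _ in range(iters):
        for i in range(num_nodes):
            if i in boundary or not neighbors[i]:
                continue  # border values stay fixed; isolated nodes are skipped
            f[i] = sum(f[j] for j in neighbors[i]) / len(neighbors[i])
    return f


# Hypothetical toy graph: a path 0-1-2-3-4 whose two end nodes are labeled.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
scores = solve_dirichlet(5, edges, boundary={0: 0.0, 4: 1.0})

# Threshold the harmonic scores to obtain a two-class grouping.
predicted = [int(s > 0.5) for s in scores]
```

On this path the solution interpolates linearly between the two border values, so the unlabeled nodes are grouped according to which labeled node they are closer to. In the paper's setting, the distances that drive this kind of propagation are themselves learned from the classification loss rather than fixed by the input graph.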

Ahmed Begga, Francisco Escolano, Miguel Angel Lozano, Edwin R. Hancock • 2023

Related benchmarks

Task	Dataset	Result
Node Classification	Pubmed	Accuracy: 89.19	819
Node Classification	Chameleon	Accuracy: 80.48	640
Node Classification	Texas (48/32/20)	Mean Accuracy: 92.43	78
Node Classification	Wisconsin (48/32/20)	Mean Accuracy: 92.54	66
Node Classification	Cornell (48/32/20)	Mean Accuracy: 87.03	66
Node Classification	Citeseer (48/32/20)	Mean Accuracy (%): 77.5	66
Node Classification	Cora (48/32/20)	Mean Accuracy: 88.43	50
Node Classification	Actor (48/32/20)	Mean Accuracy: 36.93	50
Node Classification	Squirrel (48/32/20)	Mean Accuracy: 73.48	40

