Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-based Similarity
About
Graph representation learning (GRL) methods, such as graph neural networks and graph transformer models, have been successfully used to analyze graph-structured data, mainly focusing on node classification and link prediction tasks. However, the existing studies mostly only consider local connectivity while ignoring long-range connectivity and the roles of nodes. In this paper, we propose Unified Graph Transformer Networks (UGT) that effectively integrate local and global structural information into fixed-length vector representations. First, UGT learns local structure by identifying the local substructures and aggregating features of the $k$-hop neighborhoods of each node. Second, we construct virtual edges, bridging distant nodes with structural similarity to capture the long-range dependencies. Third, UGT learns unified representations through self-attention, encoding structural distance and $p$-step transition probability between node pairs. Furthermore, we propose a self-supervised learning task that effectively learns transition probability to fuse local and global structural features, which could then be transferred to other downstream tasks. Experimental results on real-world benchmark datasets over various downstream tasks showed that UGT significantly outperformed baselines that consist of state-of-the-art models. In addition, UGT reaches the expressive power of the third-order Weisfeiler-Lehman isomorphism test (3d-WL) in distinguishing non-isomorphic graph pairs. The source code is available at https://github.com/NSLab-CUK/Unified-Graph-Transformer.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Node Classification | Cora | Accuracy88.74 | 885 | |
| Node Classification | Citeseer | Accuracy76.08 | 804 | |
| Graph Classification | PROTEINS | Accuracy80.12 | 742 | |
| Node Classification | Chameleon | Accuracy69.78 | 549 | |
| Node Classification | Squirrel | Accuracy66.96 | 500 | |
| Graph Classification | NCI1 | Accuracy77.55 | 460 | |
| Node Classification | Cornell | Accuracy70 | 426 | |
| Node Classification | Texas | Accuracy0.8667 | 410 | |
| Node Classification | Wisconsin | Accuracy81.6 | 410 | |
| Graph Classification | ENZYMES | Accuracy67.22 | 305 |