Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Recipe for a General, Powerful, Scalable Graph Transformer

About

We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with a clearer definition and categorize them as being $\textit{local}$, $\textit{global}$ or $\textit{relative}$. The prior GTs are constrained to small graphs with a few hundred nodes, here we propose the first architecture with a complexity linear in the number of nodes and edges $O(N+E)$ by decoupling the local real-edge aggregation from the fully-connected Transformer. We argue that this decoupling does not negatively affect the expressivity, with our architecture being a universal function approximator on graphs. Our GPS recipe consists of choosing 3 main ingredients: (i) positional/structural encoding, (ii) local message-passing mechanism, and (iii) global attention mechanism. We provide a modular framework $\textit{GraphGPS}$ that supports multiple types of encodings and that provides efficiency and scalability both in small and large graphs. We test our architecture on 16 benchmarks and show highly competitive results in all of them, show-casing the empirical benefits gained by the modularity and the combination of different strategies.

Ladislav Ramp\'a\v{s}ek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, Dominique Beaini• 2022

Related benchmarks

TaskDatasetResultRank
Graph ClassificationPROTEINS
Accuracy78.44
1252
Node ClassificationCora
Accuracy83.87
1215
Graph ClassificationMUTAG
Accuracy86.14
1103
Node ClassificationCiteseer
Accuracy65.71
1037
Node ClassificationCora (test)
Mean Accuracy82.84
951
Node ClassificationCiteseer (test)
Accuracy0.7273
945
Node ClassificationChameleon
Accuracy42.54
867
Node ClassificationPubmed
Accuracy88.94
865
Node ClassificationWisconsin
Accuracy62
864
Node ClassificationCornell
Accuracy45.05
851
Showing 10 of 244 rows
...

Other info

Code

Follow for update