How Attentive are Graph Attention Networks?

About

Graph Attention Networks (GATs) are one of the most popular GNN architectures and are considered the state-of-the-art architecture for representation learning with graphs. In GAT, every node attends to its neighbors given its own representation as the query. However, in this paper we show that GAT computes a very limited kind of attention: the ranking of the attention scores is unconditioned on the query node. We formally define this restricted kind of attention as static attention and distinguish it from a strictly more expressive dynamic attention. Because GATs use a static attention mechanism, there are simple graph problems that GAT cannot express: in a controlled problem, we show that static attention hinders GAT from even fitting the training data. To remove this limitation, we introduce a simple fix by modifying the order of operations and propose GATv2: a dynamic graph attention variant that is strictly more expressive than GAT. We perform an extensive evaluation and show that GATv2 outperforms GAT across 11 OGB and other benchmarks while matching its parametric cost. Our code is available at https://github.com/tech-srl/how_attentive_are_gats. GATv2 is available as part of the PyTorch Geometric library, the Deep Graph Library, and the TensorFlow GNN library.
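The difference between static and dynamic attention can be illustrated with a small NumPy sketch. It is not the authors' implementation. The scoring formulas are not spelled out in the abstract and are assumed here from the commonly cited form of the two layers: GAT scores a pair as LeakyReLU(a^T [W h_i || W h_j]), while GATv2 applies a after the nonlinearity, a^T LeakyReLU(W [h_i || h_j]). The function names, the toy features, and the hand-picked weights are all hypothetical, and every node is assumed to attend to every node, itself included.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    """Elementwise LeakyReLU; strictly increasing for any slope > 0."""
    return np.where(x > 0, x, slope * x)


def gat_scores(h, W, a):
    """GAT-style scores: e[i, j] = LeakyReLU(a^T [W h_i || W h_j]).

    h: (n, d_in) node features, W: (d_out, d_in), a: (2 * d_out,).
    """
    z = h @ W.T                      # (n, d_out) transformed features
    d_out = z.shape[1]
    s_query = z @ a[:d_out]          # contribution of the query node i
    s_key = z @ a[d_out:]            # contribution of the key node j
    # The score is a monotonic function of s_query[i] + s_key[j], so the
    # query only shifts every key's score by the same constant.
    return leaky_relu(s_query[:, None] + s_key[None, :])


def gatv2_scores(h, W, a):
    """GATv2-style scores: e[i, j] = a^T LeakyReLU(W [h_i || h_j]).

    h: (n, d_in) node features, W: (d_out, 2 * d_in), a: (d_out,).
    """
    d_in = h.shape[1]
    W_q, W_k = W[:, :d_in], W[:, d_in:]
    # pre[i, j, :] = W_q h_i + W_k h_j, which equals W [h_i || h_j].
    pre = (h @ W_q.T)[:, None, :] + (h @ W_k.T)[None, :, :]
    return leaky_relu(pre) @ a       # (n, n)


# Toy graph: two nodes with one-dimensional features (hypothetical values).
H = np.array([[-1.0], [1.0]])

# GAT with hand-picked weights: both queries rank node 1 highest.
W_GAT = np.array([[1.0], [-1.0]])
A_GAT = np.array([1.0, 0.0, 1.0, 0.0])
print(gat_scores(H, W_GAT, A_GAT).argmax(axis=1))    # [1 1]

# GATv2 with hand-picked weights: each query ranks itself highest.
W_V2 = np.array([[1.0, 1.0], [-1.0, -1.0]])
A_V2 = np.array([1.0, 1.0])
print(gatv2_scores(H, W_V2, A_V2).argmax(axis=1))    # [0 1]
```

In the GAT-style function the query shifts all of its keys' pre-activation scores by the same amount, and LeakyReLU and softmax are both monotonic, so every query ends up with the same ranking of keys. In the GATv2-style function the nonlinearity sits between W and a, so the preferred key can depend on the query, as the second print shows.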

Shaked Brody, Uri Alon, Eran Yahav • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Graph Classification | PROTEINS | Accuracy | 77.7 | 1252 |
| Node Classification | Cora | Accuracy | 85.77 | 1215 |
| Graph Classification | MUTAG | Accuracy | 84 | 1103 |
| Node Classification | Citeseer | Accuracy | 73.9 | 1037 |
| Node Classification | Cora (test) | Mean Accuracy | 83.02 | 951 |
| Node Classification | Citeseer (test) | Accuracy | 0.7182 | 945 |
| Node Classification | Chameleon | Accuracy | 62.2 | 867 |
| Node Classification | Pubmed | Accuracy | 79.4 | 865 |
| Node Classification | Wisconsin | Accuracy | 55.2 | 864 |
| Node Classification | Cornell | Accuracy | 72.5 | 851 |
Showing 10 of 236 rows
...
