Pure Transformers are Powerful Graph Learners
About
We show that standard Transformers without graph-specific modifications can yield promising results in graph learning, both in theory and in practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNNs). When trained on a large-scale graph dataset (PCQM4Mv2), our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results against Transformer variants with sophisticated graph-specific inductive biases. Our implementation is available at https://github.com/jw9730/tokengt.
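The tokenization described above can be sketched in a few lines of PyTorch. This is a minimal, hypothetical illustration (not the reference implementation): each node becomes a token `[X_v, P_v, P_v]` and each edge `(u, v)` a token `[E_uv, P_u, P_v]`, where `P` are orthonormal node identifiers (random orthonormal rows here; the paper also uses Laplacian eigenvectors), plus a learned type embedding distinguishing node from edge tokens. The resulting sequence goes into a completely standard Transformer encoder. All class and parameter names below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GraphTokenizer(nn.Module):
    """Sketch of TokenGT-style tokenization: one token per node and per edge,
    augmented with orthonormal node identifiers and a type embedding."""

    def __init__(self, feat_dim, id_dim, d_model):
        super().__init__()
        self.id_dim = id_dim  # assumes id_dim >= number of nodes per graph
        # each token is [features, identifier of endpoint 1, identifier of endpoint 2]
        self.proj = nn.Linear(feat_dim + 2 * id_dim, d_model)
        self.type_emb = nn.Embedding(2, d_model)  # 0 = node token, 1 = edge token

    def forward(self, x, edge_index, edge_attr):
        n = x.size(0)
        # random orthonormal node identifiers: rows of an orthogonal matrix
        node_id = torch.linalg.qr(torch.randn(self.id_dim, self.id_dim))[0][:n]
        src, dst = edge_index
        node_tok = torch.cat([x, node_id, node_id], dim=-1)              # [X_v, P_v, P_v]
        edge_tok = torch.cat([edge_attr, node_id[src], node_id[dst]], dim=-1)  # [E_uv, P_u, P_v]
        tokens = self.proj(torch.cat([node_tok, edge_tok], dim=0))
        types = torch.cat([torch.zeros(n, dtype=torch.long),
                           torch.ones(src.size(0), dtype=torch.long)])
        return tokens + self.type_emb(types)

# feed the token sequence to a vanilla (graph-agnostic) Transformer encoder
tokenizer = GraphTokenizer(feat_dim=8, id_dim=8, d_model=32)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
x = torch.randn(5, 8)                               # 5 nodes with 8-dim features
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])   # 3 directed edges
edge_attr = torch.randn(3, 8)                       # 8-dim edge features
seq = tokenizer(x, edge_index, edge_attr).unsqueeze(0)  # (1, 5 + 3, 32)
out = encoder(seq)
print(out.shape)  # torch.Size([1, 8, 32])
```

Note that nothing in the encoder knows it is processing a graph: all graph structure enters solely through the node-identifier embeddings attached to each token.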
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Node Classification | Chameleon | Accuracy | 38.1 | 549 |
| Node Classification | Squirrel | Accuracy | 29.4 | 500 |
| Graph Classification | NCI1 | Accuracy | 76.7 | 460 |
| Graph Classification | IMDB-B | Accuracy | 80.2 | 322 |
| Node Classification | Citeseer | Accuracy | 47 | 275 |
| Graph Classification | NCI109 | Accuracy | 72.1 | 223 |
| Graph Classification | IMDB-M | Accuracy | 47 | 218 |
| Graph Regression | ZINC (test) | MAE | 0.047 | 204 |
| Graph Classification | DD | Accuracy | 73.9 | 175 |
| Graph Regression | OGB-LSC PCQM4M v2 (val) | MAE | 0.091 | 81 |