
Rethinking Graph Transformers with Spectral Attention

About

In recent years, the Transformer architecture has proven very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the $\textit{Spectral Attention Network}$ (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian spectrum to learn the position of each node in a given graph. This LPE is then added to the node features of the graph and passed to a fully-connected Transformer. By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance. Further, by fully connecting the graph, the Transformer does not suffer from over-squashing, an information bottleneck of most GNNs, and enables better modeling of physical phenomena such as heat transfer and electric interactions. When tested empirically on a set of 4 standard datasets, our model performs on par with or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin, becoming the first fully-connected architecture to perform well on graph benchmarks.
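The core idea of Laplacian positional encodings can be sketched as follows. This is a simplified, hedged illustration using numpy: SAN itself *learns* an encoding from the full spectrum with a Transformer over (eigenvalue, eigenvector) pairs, whereas this sketch simply takes the first few non-trivial eigenvectors of the normalized Laplacian as fixed node positions, in the spirit of earlier Laplacian-PE work. The function name and toy graph are illustrative, not from the paper.

```python
import numpy as np

def laplacian_positional_encoding(adj, k):
    """Return k-dimensional spectral positions for each node.

    Simplified stand-in for SAN's learned positional encoding:
    take the first k non-trivial eigenvectors of the symmetric
    normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    Assumes an undirected graph with no isolated nodes.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)  # safe: no isolated nodes assumed
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors
    eigvals, eigvecs = np.linalg.eigh(lap)
    # Skip the trivial eigenvector (eigenvalue ~0); keep the next k
    return eigvecs[:, 1:k + 1]

# Toy example: a 4-cycle graph
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
pe = laplacian_positional_encoding(A, 2)
print(pe.shape)  # (4, 2)
```

In the full model, these per-node encodings would be concatenated or added to the node features before the fully-connected Transformer layers; note that eigenvectors are defined only up to sign (and rotation within degenerate eigenspaces), which is one reason the paper learns the encoding rather than using raw eigenvectors.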

Devin Kreuzer, Dominique Beaini, William L. Hamilton, Vincent Létourneau, Prudencio Tossou, 2021

Related benchmarks

| Task                 | Dataset   | Accuracy | Rank |
|----------------------|-----------|----------|------|
| Node Classification  | Cora      | 84.81    | 1215 |
| Graph Classification | PROTEINS  | 68.47    | 994  |
| Node Classification  | Citeseer  | 73.99    | 931  |
| Node Classification  | Pubmed    | 88.22    | 819  |
| Node Classification  | Chameleon | 64.02    | 640  |
| Node Classification  | Wisconsin | 82.66    | 627  |
| Node Classification  | Texas     | 85.18    | 616  |
| Node Classification  | Squirrel  | 46.28    | 591  |
| Node Classification  | Cornell   | 79.62    | 582  |
| Graph Classification | NCI1      | 59.31    | 501  |
Showing 10 of 103 rows
