GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings

About

We present GNNAutoScale (GAS), a framework for scaling arbitrary message-passing GNNs to large graphs. GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations, leading to constant GPU memory consumption with respect to input node size without dropping any data. While existing solutions weaken the expressive power of message passing due to sub-sampling of edges or non-trainable propagations, our approach is provably able to maintain the expressive power of the original GNN. We achieve this by providing approximation error bounds of historical embeddings and showing how to tighten them in practice. Empirically, we show that the practical realization of our framework, PyGAS, an easy-to-use extension for PyTorch Geometric, is both fast and memory-efficient, learns expressive node representations, closely matches the performance of its non-scaling counterparts, and reaches state-of-the-art performance on large-scale graphs.

Matthias Fey, Jan E. Lenssen, Frank Weichert, Jure Leskovec • 2021
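
The historical-embedding idea described in the abstract can be illustrated with a small sketch. This is a toy in plain Python and not the PyGAS API: the scalar features, the mean aggregation, the two-layer depth, and every name (`mean_layer`, `full_batch`, `gas_minibatch`, `sweep`) are assumptions made for illustration. For a mini-batch, layer-1 embeddings of in-batch nodes are computed fresh and pushed to a history table, while layer-1 embeddings of out-of-batch neighbours are pulled from that table instead of being recomputed, which is what cuts off the deeper sub-trees of the computation graph.

```python
# Toy illustration of historical embeddings (hypothetical names, not PyGAS).

def mean_layer(self_val, neigh_vals, w_self, w_neigh):
    """One scalar message-passing step: ReLU(w_self*h_v + w_neigh*mean(h_u))."""
    agg = sum(neigh_vals) / len(neigh_vals) if neigh_vals else 0.0
    return max(0.0, w_self * self_val + w_neigh * agg)

def full_batch(x, adj, weights):
    """Reference forward pass over the whole graph, layer by layer."""
    h = list(x)
    for w_self, w_neigh in weights:
        h = [mean_layer(h[v], [h[u] for u in adj[v]], w_self, w_neigh)
             for v in range(len(x))]
    return h

def gas_minibatch(batch, x, adj, weights, history):
    """Two-layer forward pass for the nodes in `batch` only."""
    in_batch = set(batch)
    (w1s, w1n), (w2s, w2n) = weights
    # Layer 1: input features of direct neighbours are read as-is.
    h1 = {v: mean_layer(x[v], [x[u] for u in adj[v]], w1s, w1n) for v in batch}
    # Push: store the fresh layer-1 embeddings for later mini-batches.
    for v in batch:
        history[v] = h1[v]
    # Layer 2: pull possibly stale embeddings for out-of-batch neighbours.
    out = {}
    for v in batch:
        neigh = [h1[u] if u in in_batch else history[u] for u in adj[v]]
        out[v] = mean_layer(h1[v], neigh, w2s, w2n)
    return out

def sweep(batches, x, adj, weights, history):
    """Run every mini-batch once and collect the per-node outputs."""
    out = [0.0] * len(x)
    for batch in batches:
        for v, val in gas_minibatch(batch, x, adj, weights, history).items():
            out[v] = val
    return out

# Path graph 0-1-2-3-4-5, split into two mini-batches.
adj = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
weights = [(0.5, 0.5), (1.0, -0.25)]
batches = [[0, 1, 2], [3, 4, 5]]
history = [0.0] * len(x)  # layer-1 history, initialised to zeros

full = full_batch(x, adj, weights)
first = sweep(batches, x, adj, weights, history)   # uses a stale entry for node 3
second = sweep(batches, x, adj, weights, history)  # history is now up to date
```

With the weights held fixed, the first sweep gives a slightly wrong output for node 2 because its out-of-batch neighbour's history entry is still the zero initialisation. The second sweep reproduces the full-batch result, since every history entry has been refreshed by then. During training the weights change between iterations, so histories are only approximately correct, which is the approximation error the abstract refers to.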

Related benchmarks

Task	Dataset	Metric	Result
Node Classification	Cora	Macro-F1	43.45	30
Node Classification	Citeseer	F1 Score	39.72	27
AML Node Classification	Synthetic AML HI-Small	Average F1 Score	54.36	12
AML Node Classification	Synthetic AML HI-Medium	Average F1	56.12	12
AML Node Classification	Synthetic AML LI-Small	Average F1 Score	16.14	12
AML Node Classification	Synthetic AML LI-Medium	Avg F1 Score	0.1129	12
AML Node Classification	Synthetic AML LI-Large	Average F1 Score	0.00e+0	12
AML Node Classification	Synthetic AML HI-Large	Average F1 Score	52.1	12
Node Classification	Pubmed	Average F1	63.47	8
Node Classification	MSAcademic	Average F1	81.68	8
