Local Augmentation for Graph Neural Networks
About
Graph Neural Networks (GNNs) have achieved remarkable performance on graph-based tasks. The key idea of GNNs is to obtain informative representations by aggregating information from local neighborhoods. However, it remains an open question whether the neighborhood information is adequately aggregated for learning representations of nodes with few neighbors. To address this, we propose a simple and efficient data augmentation strategy, local augmentation, which learns the distribution of the neighbors' node features conditioned on the central node's feature and enhances the GNN's expressive power with generated features. Local augmentation is a general framework that can be applied to any GNN model in a plug-and-play manner. It samples feature vectors associated with each node from the learned conditional distribution as additional input for the backbone model at each training iteration. Extensive experiments and analyses show that local augmentation consistently yields performance improvements when applied to various GNN architectures across a diverse set of benchmarks. For example, plugging local augmentation into GCN and GAT improves test accuracy by an average of 3.4% and 1.6%, respectively, on Cora, Citeseer, and Pubmed. In addition, our experimental results on large graphs (OGB) show that our model consistently improves performance over the backbones. Code is available at https://github.com/SongtaoLiu0823/LAGNN.
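The sampling step described above can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that): the generator here is a fixed linear-Gaussian map standing in for the learned conditional distribution, and every name (`sample_generated_features`, `augmented_gcn_layer`, `w_gen`, `noise_scale`, and so on) is hypothetical. It only shows the general shape of the idea: draw a fresh generated feature matrix on each iteration and feed it to a GCN-style layer alongside the original features.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sample_generated_features(x, w_gen, noise_scale, rng):
    """Toy stand-in (assumption) for the learned conditional generator.

    Draws one feature vector per node from a Gaussian whose mean is a
    linear function of that node's own feature vector.
    """
    mean = x @ w_gen
    return mean + noise_scale * rng.standard_normal(mean.shape)


def augmented_gcn_layer(adj_norm, x, x_gen, w_x, w_gen_branch):
    """One GCN-style layer with two branches (original and generated
    features), each followed by ReLU, concatenated along the feature axis."""
    h_x = np.maximum(adj_norm @ x @ w_x, 0.0)
    h_g = np.maximum(adj_norm @ x_gen @ w_gen_branch, 0.0)
    return np.concatenate([h_x, h_g], axis=1)


rng = np.random.default_rng(0)

# Path graph on 4 nodes: 0 - 1 - 2 - 3 (nodes 0 and 3 have a single neighbor).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
adj_norm = normalize_adjacency(adj)

x = rng.standard_normal((4, 5))             # original node features
w_gen = rng.standard_normal((5, 5))         # hypothetical generator weights
w_x = rng.standard_normal((5, 3))           # layer weights, original branch
w_gen_branch = rng.standard_normal((5, 3))  # layer weights, generated branch

# A fresh sample is drawn per "training iteration", so the extra input varies.
x_gen_1 = sample_generated_features(x, w_gen, 0.1, rng)
x_gen_2 = sample_generated_features(x, w_gen, 0.1, rng)

h = augmented_gcn_layer(adj_norm, x, x_gen_1, w_x, w_gen_branch)
print(h.shape)
```

In this sketch the backbone layer itself is unchanged apart from receiving a second input, which is one way to read the "plug-and-play" description; how the two inputs are actually combined in the authors' models is not stated in the abstract.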
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Node Classification | Cora (semi-supervised) | Accuracy | 85.42 | 103 |
| Node Classification | Cite semi-supervised | Accuracy | 74.83 | 61 |
| Node Classification | PubMed semi-supervised | Accuracy | 81.73 | 42 |
| Node Classification | Physics semi-supervised | Accuracy | 94.52 | 30 |
| Node Classification | CS semi-supervised | Accuracy | 92.71 | 30 |
| Node Classification | CORA inductive setting (test) | Accuracy | 82.7 | 22 |
| Node Classification | CITESEER inductive setting (test) | Accuracy | 73 | 21 |
| Semi-supervised node classification | Ogbn-arxiv | Accuracy | 0.6996 | 20 |
| Node Classification | ogbn-arxiv full-supervised 100% training size | Accuracy | 73.77 | 15 |
| Node Classification | Flickr semi-supervised 5% training size | Accuracy | 50.82 | 15 |