Augmentation-Free Graph Contrastive Learning with Performance Guarantee
About
Graph contrastive learning (GCL) is the most representative and prevalent self-supervised learning approach for graph-structured data. Despite its remarkable success, existing GCL methods highly rely on an augmentation scheme to learn the representations invariant across different augmentation views. In this work, we revisit such a convention in GCL through examining the effect of augmentation techniques on graph data via the lens of spectral theory. We found that graph augmentations preserve the low-frequency components and perturb the middle-and high-frequency components of the graph, which contributes to the success of GCL algorithms on homophilic graphs but hinder its application on heterophilic graphs, due to the high-frequency preference of heterophilic data. Motivated by this, we propose a novel, theoretically-principled, and augmentation-free GCL method, named AF-GCL, that (1) leverages the features aggregated by Graph Neural Network to construct the self-supervision signal instead of augmentations and therefore (2) is less sensitive to the graph homophily degree. Theoretically, We present the performance guarantee for AF-GCL as well as an analysis for understanding the efficacy of AF-GCL. Extensive experiments on 14 benchmark datasets with varying degrees of heterophily show that AF-GCL presents competitive or better performance on homophilic graphs and outperforms all existing state-of-the-art GCL methods on heterophilic graphs with significantly less computational overhead.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Node Classification | Pubmed | Accuracy81.5 | 627 | |
| Node Classification | Cora | Accuracy83.16 | 583 | |
| Node Classification | Amazon Photo | Accuracy92.49 | 313 | |
| Node Classification | Amazon Computers | Accuracy89.68 | 167 | |
| Node Classification | Citeseer | Accuracy71.96 | 51 | |
| Node Classification | Coauthor CS | Accuracy91.92 | 15 |