GraphMAE: Self-Supervised Masked Graph Autoencoders

About

Self-supervised learning (SSL) has been extensively explored in recent years. In particular, generative SSL has seen growing success in natural language processing and other AI fields, as shown by the wide adoption of BERT and GPT. Despite this, contrastive learning, which relies heavily on structural data augmentation and complicated training strategies, has been the dominant approach in graph SSL, while the progress of generative SSL on graphs, especially graph autoencoders (GAEs), has so far not reached the potential shown in other fields. In this paper, we identify and examine the issues that negatively impact the development of GAEs, including their reconstruction objective, training robustness, and error metric. We present a masked graph autoencoder, GraphMAE, that mitigates these issues for generative self-supervised graph pretraining. Instead of reconstructing graph structures, we propose to focus on feature reconstruction with both a masking strategy and a scaled cosine error that benefit the robust training of GraphMAE. We conduct extensive experiments on 21 public datasets for three different graph learning tasks. The results show that GraphMAE, a simple graph autoencoder with careful designs, consistently outperforms both contrastive and generative state-of-the-art baselines. This study provides an understanding of graph autoencoders and demonstrates the potential of generative self-supervised pre-training on graphs.

Zhenyu Hou, Xiao Liu, Yukuo Cen, Yuxiao Dong, Hongxia Yang, Chunjie Wang, Jie Tang • 2022
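
The two ingredients the abstract names, feature masking and a scaled cosine error, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the scaled cosine error takes the form of the mean of (1 - cosine similarity) raised to a power gamma >= 1 over the masked nodes, and it uses a zero vector as a stand-in for a mask token. The function names `mask_node_features` and `scaled_cosine_error`, the mask rate, and the random "reconstruction" are hypothetical and chosen only for illustration; the graph encoder and decoder are omitted entirely.

```python
import numpy as np


def mask_node_features(x, mask_rate, rng):
    """Zero out the feature rows of a random subset of nodes.

    Assumption: a zero vector stands in for a mask token.
    Returns the masked copy and the indices of the masked nodes.
    """
    num_nodes = x.shape[0]
    num_masked = max(1, int(mask_rate * num_nodes))
    masked_idx = rng.permutation(num_nodes)[:num_masked]
    x_masked = x.copy()
    x_masked[masked_idx] = 0.0
    return x_masked, masked_idx


def scaled_cosine_error(x_true, x_pred, gamma=2.0, eps=1e-12):
    """Mean of (1 - cosine similarity) ** gamma over rows.

    Assumed form of the scaled cosine error; gamma >= 1 down-weights
    rows that are already reconstructed well.
    """
    x_true = x_true / (np.linalg.norm(x_true, axis=1, keepdims=True) + eps)
    x_pred = x_pred / (np.linalg.norm(x_pred, axis=1, keepdims=True) + eps)
    cos = np.sum(x_true * x_pred, axis=1)
    # Clip guards against tiny negative values from floating-point rounding.
    return float(np.mean(np.clip(1.0 - cos, 0.0, None) ** gamma))


# Toy usage: 8 nodes with 4-dimensional features.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4))
masked_features, masked_idx = mask_node_features(features, mask_rate=0.5, rng=rng)

# Stand-in for a decoder output; a real model would predict this from
# masked_features and the graph structure.
reconstruction = features + 0.1 * rng.normal(size=features.shape)

# The loss is evaluated only on the masked nodes.
loss = scaled_cosine_error(features[masked_idx], reconstruction[masked_idx])
```

Because the error depends only on the angle between the original and reconstructed feature vectors, it is insensitive to feature magnitude, and raising it to the power gamma shrinks the contribution of rows that are already close to their targets.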

Related benchmarks

Task	Dataset	Result
Graph Classification	PROTEINS	Accuracy 75.8	1252
Node Classification	Cora	Accuracy 84.2	1215
Graph Classification	MUTAG	Accuracy 89.3	1103
Node Classification	Citeseer	Accuracy 73.4	1037
Node Classification	Cora (test)	Mean Accuracy 73.1	951
Node Classification	Chameleon	Accuracy 79.5	867
Node Classification	Pubmed	Accuracy 81.1	865
Node Classification	Wisconsin	Accuracy 74.71	864
Node Classification	Cornell	Accuracy 75.14	851
Node Classification	Texas	Accuracy 0.7432	801

Showing 10 of 187 rows

