Hi-GMAE: Hierarchical Graph Masked Autoencoders
About
Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily reconstruct node-level information, which makes them single-scale GMAEs. This approach, while effective in certain contexts, tends to overlook the hierarchical structures inherent in many real-world graphs. For instance, molecular graphs exhibit a clear hierarchy of atoms, functional groups, and molecules. Because single-scale GMAE models cannot incorporate these hierarchical relationships, they often fail to capture crucial high-level graph information, which noticeably degrades performance. To address this limitation, we propose Hierarchical Graph Masked AutoEncoders (Hi-GMAE), a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs. First, Hi-GMAE constructs a multi-scale graph hierarchy through graph pooling, enabling the exploration of graph structures at different levels of granularity. To ensure that subgraphs are masked uniformly across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to finer scales. Furthermore, we integrate a gradual recovery strategy with the masking process to mitigate the learning challenges posed by completely masked subgraphs. Our experiments on 17 graph datasets, covering two graph learning tasks, consistently demonstrate that Hi-GMAE outperforms 29 state-of-the-art self-supervised competitors in capturing comprehensive graph information.
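The coarse-to-fine masking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `coarse_to_fine_mask`, the hard-assignment format of `assignments`, and the uniform random sampling at the coarsest scale are all assumptions made for illustration. The sketch only shows how a mask drawn on the coarsest graph can be back-projected so that every node pooled into a masked supernode is masked as well.

```python
import random


def coarse_to_fine_mask(assignments, num_coarse, mask_ratio, seed=0):
    """Sample a mask at the coarsest scale and back-project it to finer scales.

    Hypothetical input format: assignments[l][i] is the index of the
    supernode at scale l + 1 that node i of scale l was pooled into.
    Returns one boolean mask per scale, finest scale first.
    """
    rng = random.Random(seed)

    # Step 1: choose which coarsest-scale supernodes to mask (assumed uniform).
    num_masked = round(mask_ratio * num_coarse)
    masked_ids = set(rng.sample(range(num_coarse), num_masked))
    mask = [i in masked_ids for i in range(num_coarse)]
    masks = [mask]

    # Step 2: walk from the coarsest pooling step down to the finest one;
    # each node inherits the mask state of the supernode it was pooled into.
    for assign in reversed(assignments):
        mask = [mask[parent] for parent in assign]
        masks.append(mask)

    masks.reverse()  # finest scale first
    return masks


# Toy hierarchy: 6 nodes -> 3 supernodes -> 2 supernodes.
toy_assignments = [[0, 0, 1, 1, 2, 2], [0, 0, 1]]
toy_masks = coarse_to_fine_mask(toy_assignments, num_coarse=2, mask_ratio=0.5)
```

Because each node inherits its state from its ancestor, a subgraph is either fully masked or fully visible at every scale, which is the uniformity property the strategy aims for. The gradual recovery step is not covered by this sketch.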
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Graph Classification | MUTAG (10-fold cross-validation) | Accuracy | 81.45 | 219 |
| Graph Classification | PROTEINS (10-fold cross-validation) | Accuracy | 83.88 | 214 |
| Molecular property prediction | MoleculeNet BBBP (scaffold) | ROC-AUC | 72.5 | 140 |
| Molecular property prediction | MoleculeNet SIDER (scaffold) | ROC-AUC | 0.62 | 120 |
| Molecular property prediction | MoleculeNet BACE (scaffold) | ROC-AUC | 85 | 110 |
| Graph Classification | NCI1 (10-fold cross-validation) | Accuracy | 82.21 | 101 |
| Molecular property prediction | MoleculeNet MUV (scaffold) | ROC-AUC | 0.775 | 91 |
| Molecular property prediction | TOXCAST (scaffold) | ROC-AUC | 65.3 | 75 |
| Graph Classification | ENZYMES (10-fold cross-validation) | Accuracy | 52.6 | 75 |
| Molecular Property Classification | ClinTox (scaffold) | ROC-AUC | 0.864 | 65 |