Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning
About
Vector Quantization (VQ) has recently emerged as a promising approach for learning compressed and discrete representations for graph-structured data. However, a fundamental challenge, i.e., codebook collapse, remains underexplored in the graph domain, significantly limiting the expressiveness and generalization of graph tokens. In this paper, we present an empirical study and observe that codebook collapse consistently occurs when training VQ jointly with Graph Neural Networks under graph reconstruction tasks, even with mitigation strategies proposed in vision or language domains. Moreover, we provide a diagnosis of collapse from data and optimization perspectives, showing that collapse is associated with graph data properties such as feature redundancy and connectivity density, and is further reinforced by the training dynamics of deterministic hard assignment. To address these issues, we propose RGVQ, a novel framework that integrates graph topology and feature similarity as explicit regularization signals to enhance codebook utilization and promote token diversity. RGVQ introduces soft assignments via Gumbel-Softmax reparameterization, ensuring that all codewords receive gradient updates. In addition, RGVQ incorporates a structure-aware contrastive regularization to penalize assigning the same token to dissimilar node pairs. Extensive experiments demonstrate that RGVQ substantially improves codebook utilization and consistently boosts the performance of state-of-the-art graph VQ backbones across multiple downstream tasks, enabling more expressive and transferable graph token representations.
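To make the two mechanisms named in the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that assignment logits are negative squared distances between a node embedding and each codeword, and that the penalty for a dissimilar node pair is the probability that both nodes pick the same token. The function names (`soft_assign`, `same_token_penalty`, `codebook_utilization`), the toy codebook, and the temperature `tau` are all hypothetical choices made for illustration.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Return a relaxed one-hot vector: softmax((logits + Gumbel noise) / tau)."""
    noisy = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_assign(z, codebook, tau, rng):
    """Soft assignment of embedding z over all codewords.

    Assumption: logits are negative squared Euclidean distances.
    """
    logits = [-sum((a - b) ** 2 for a, b in zip(z, c)) for c in codebook]
    return gumbel_softmax(logits, tau, rng)


def same_token_penalty(p_i, p_j):
    """Assumed penalty for a dissimilar pair: probability of sharing a token."""
    return sum(a * b for a, b in zip(p_i, p_j))


def codebook_utilization(assignments, num_codes):
    """Fraction of codewords selected by at least one hard (argmax) assignment."""
    used = {max(range(num_codes), key=p.__getitem__) for p in assignments}
    return len(used) / num_codes


# Toy data (hypothetical): three 2-D node embeddings and three codewords.
rng = random.Random(0)
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
embeddings = [[0.1, -0.1], [0.9, 1.2], [-1.1, 1.8]]
probs = [soft_assign(z, codebook, tau=1.0, rng=rng) for z in embeddings]

print("soft assignments:", probs)
print("penalty for pair (0, 1):", same_token_penalty(probs[0], probs[1]))
print("utilization:", codebook_utilization(probs, len(codebook)))
```

Because every entry of a soft assignment is strictly positive, every codeword contributes to the output and would receive a gradient in a differentiable framework, in contrast to a hard argmax, where only the selected codeword is updated. Lowering `tau` pushes the soft assignment toward a one-hot vector.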
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Node Classification | Pubmed | Accuracy 86.54 | 627 |
| Node Classification | Cora | Accuracy 88.34 | 583 |
| Node Classification | amazon-ratings | Accuracy 55.16 | 309 |
| Link Prediction | WN18RR | -- | 219 |
| Node Classification | Computer | Accuracy 95.67 | 159 |
| Graph Classification | HIV | ROC-AUC 0.741 | 155 |
| Node Classification | Photo | Accuracy 97.66 | 153 |
| Node Classification | questions | ROC-AUC 0.7826 | 127 |
| Link Classification | FB15k-237 | Accuracy 90.45 | 97 |
| Node Classification | wikiCS | Accuracy 83.58 | 55 |