Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions
About
Graph generative models are a highly active branch of machine learning. Given the steady development of new models of ever-increasing complexity, it is necessary to provide a principled way to evaluate and compare them. In this paper, we enumerate the desirable criteria for such a comparison metric and provide an overview of the status quo of graph generative model comparison in use today, which predominantly relies on the maximum mean discrepancy (MMD). We perform a systematic evaluation of MMD in the context of graph generative model comparison, highlighting some of the challenges and pitfalls researchers may inadvertently encounter. After conducting a thorough analysis of the behaviour of MMD on synthetically generated perturbed graphs as well as on recently proposed graph generative models, we provide a suitable procedure to mitigate these challenges and pitfalls. We aggregate our findings into a list of practical recommendations for researchers to use when evaluating graph generative models.
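To make the central metric concrete, the following is a minimal sketch of a (biased) squared-MMD estimate with an RBF kernel between two sets of graph descriptors. The choice of descriptor (fixed-length degree histograms) and the bandwidth `sigma` are illustrative assumptions, not the paper's exact evaluation setup.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of X and Y,
    # mapped through a Gaussian (RBF) kernel.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd_rbf(X, Y, sigma=1.0):
    # Biased estimate of squared MMD between two samples of
    # graph descriptors (one row per graph).
    kxx = rbf_kernel(X, X, sigma).mean()
    kyy = rbf_kernel(Y, Y, sigma).mean()
    kxy = rbf_kernel(X, Y, sigma).mean()
    return kxx + kyy - 2 * kxy

# Toy usage: compare degree histograms of a "reference" and a
# "generated" sample of graphs (values here are random placeholders).
rng = np.random.default_rng(0)
X = rng.random((5, 4))  # 5 reference graphs, 4-bin degree histograms
Y = rng.random((6, 4))  # 6 generated graphs
print(mmd_rbf(X, Y, sigma=1.0))
```

Note that the biased estimator is always non-negative and equals zero when both samples coincide; in practice, results are sensitive to the kernel and bandwidth choice, which is one of the pitfalls the paper discusses.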
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Graph generation | Triangle Grid | MMD RBF | 0.009 | 12 |
| Graph Generative Modeling | PTC (test) | Degree Distribution Error | 1.00e-4 | 12 |
| Graph Generative Modeling | Mutag (test) | Degree Distribution | 3.00e-4 | 12 |
| Graph generation | Lobster | MMD RBF | 0.04 | 6 |
| Graph generation | ogbg-molbbbp | MMD RBF | 0.002 | 6 |
| Graph generation | PTC | MMD RBF | 0.04 | 6 |
| Graph generation | PROTEIN | MMD RBF | 0.04 | 6 |