
GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?

About

Large-scale graphs with node attributes are increasingly common in real-world applications. Creating synthetic, attribute-rich graphs that mirror real-world examples is crucial, especially for sharing graph data for analysis and for developing learning models when the original data cannot be shared. Traditional graph generation methods are limited in their capacity to handle these complex structures. Recent advances in diffusion models have shown potential in generating graph structures without attributes and smaller molecular graphs. However, these models face challenges in generating large attributed graphs because of the complex attribute-structure correlations and the large size of these graphs. This paper introduces a novel diffusion model, GraphMaker, specifically designed for generating large attributed graphs. We explore various combinations of node attribute and graph structure generation processes, finding that an asynchronous approach more effectively captures the intricate attribute-structure correlations. We also address scalability issues by generating edges in mini-batches. To demonstrate the practicality of our approach in graph data dissemination, we introduce a new evaluation pipeline. The evaluation demonstrates that synthetic graphs generated by GraphMaker can be used to develop competitive graph machine learning models for the tasks defined over the original graphs without actually accessing these graphs, while many leading graph generation methods fall short in this evaluation.

Mufei Li, Eleonora Kreačić, Vamsi K. Potluru, Pan Li • 2023
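
The abstract mentions edge mini-batching as the answer to scalability. The general idea can be illustrated with a minimal sketch: rather than scoring all node pairs of a large graph at once, enumerate candidate pairs in fixed-size chunks and sample edges chunk by chunk. This is not the authors' implementation. The names `pair_batches`, `sample_edges`, and `toy_edge_prob` are hypothetical, and the toy scorer merely stands in for a learned denoising network.

```python
import itertools
import random
from typing import Callable, Iterator, List, Tuple

Pair = Tuple[int, int]


def pair_batches(num_nodes: int, batch_size: int) -> Iterator[List[Pair]]:
    """Yield all unordered node pairs (i < j) in chunks of at most batch_size."""
    pairs = itertools.combinations(range(num_nodes), 2)
    while True:
        batch = list(itertools.islice(pairs, batch_size))
        if not batch:
            return
        yield batch


def sample_edges(
    num_nodes: int,
    batch_size: int,
    edge_prob: Callable[[List[Pair]], List[float]],
    rng: random.Random,
) -> List[Pair]:
    """Sample edges one mini-batch of candidate pairs at a time.

    Only batch_size pairs are scored per call to edge_prob, so peak memory
    for scoring does not grow with the total number of pairs.
    """
    edges: List[Pair] = []
    for batch in pair_batches(num_nodes, batch_size):
        probs = edge_prob(batch)
        edges.extend(pair for pair, p in zip(batch, probs) if rng.random() < p)
    return edges


def toy_edge_prob(batch: List[Pair]) -> List[float]:
    """Hypothetical scorer (assumption): same-parity nodes connect more often."""
    return [0.9 if i % 2 == j % 2 else 0.05 for i, j in batch]


if __name__ == "__main__":
    rng = random.Random(0)
    edges = sample_edges(num_nodes=100, batch_size=512, edge_prob=toy_edge_prob, rng=rng)
    print(f"sampled {len(edges)} edges from {100 * 99 // 2} candidate pairs")
```

In a real model the scorer would also condition on node attributes and the diffusion time step; the sketch only shows the batching pattern.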

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Synthetic Network Generation | YouTube | GFDL1 Score | 7.97 | 29 |
| Synthetic Network Generation | DBLP | GFDL1 | 17.45 | 29 |
| Synthetic Network Generation | PolBlogs | GFDL1 Score | 7.88 | 29 |
| Synthetic Network Generation | Yelp | GFDL1 Link Prediction Accuracy | 5.17 | 25 |
| Synthetic Network Generation | YouTube | Eigenvalue MMD | 23.13 | 14 |
| Graph generation | Citeseer | Degree | 0.56 | 13 |
| Synthetic Network Generation | DBLP | Transitivity | 7.98 | 11 |
| Synthetic Network Generation | PolBlogs | Triangle Similarity (MMD) | 3.27 | 11 |
| Synthetic Network Generation | YouTube | Link Prediction Accuracy Ratio | 99 | 10 |
| Synthetic Network Generation | DBLP | Link Prediction Accuracy | 95 | 10 |

Showing 10 of 15 rows
