FairGen: Towards Fair Graph Generation

About

There have been tremendous efforts over the past decades dedicated to the generation of realistic graphs in a variety of domains, ranging from social networks to computer networks, from gene regulatory networks to online transaction networks. Despite the remarkable success, the vast majority of these works are unsupervised in nature and are typically trained to minimize the expected graph reconstruction loss, which would result in the representation disparity issue in the generated graphs, i.e., the protected groups (often minorities) contribute less to the objective and thus suffer from systematically higher errors. In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and user-preferred parity constraints. In particular, we start from the investigation of representation disparity in the context of graph generative models. To mitigate the disparity, we propose a fairness-aware graph generative model named FairGen. Our model jointly trains a label-informed graph generation module and a fair representation learning module by progressively learning the behaviors of the protected and unprotected groups, from the 'easy' concepts to the 'hard' ones. In addition, we propose a generic context sampling strategy for graph generative models, which is proven to be capable of fairly capturing the contextual information of each group with a high probability. Experimental results on seven real-world data sets, including web-based graphs, demonstrate that FairGen (1) obtains performance on par with state-of-the-art graph generative models across nine network properties, (2) mitigates the representation disparity issues in the generated graphs, and (3) substantially boosts the model performance by up to 17% in downstream tasks via data augmentation.
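The representation disparity described above can be illustrated with a small numeric sketch. This is not FairGen's training code: the per-node losses, the group labels, and the helper `group_mean_losses` are all hypothetical, chosen only to show how an average loss can look small while a minority group carries a much higher error.

```python
# Illustrative sketch only: hypothetical losses, not results from the paper.

def mean(values):
    return sum(values) / len(values)

def group_mean_losses(losses, groups):
    """Return the mean loss for each group label (hypothetical helper)."""
    by_group = {}
    for loss, group in zip(losses, groups):
        by_group.setdefault(group, []).append(loss)
    return {group: mean(vals) for group, vals in by_group.items()}

# Assumed toy data: 90 unprotected nodes with low loss,
# 10 protected nodes with high loss.
losses = [0.1] * 90 + [0.9] * 10
groups = ["unprotected"] * 90 + ["protected"] * 10

overall_loss = mean(losses)                      # dominated by the majority
per_group = group_mean_losses(losses, groups)    # exposes the gap
disparity = per_group["protected"] - per_group["unprotected"]

print(f"overall: {overall_loss:.2f}")
print(f"per group: {per_group}")
print(f"disparity: {disparity:.2f}")
```

In this toy setting the overall mean stays close to the majority's loss, so an objective that minimizes only the expected loss has little incentive to reduce the protected group's error.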

Lecheng Zheng, Dawei Zhou, Hanghang Tong, Jiejun Xu, Yada Zhu, Jingrui He • 2023

Related benchmarks

Task	Dataset	Result
Node Classification	German	Accuracy 74.04	40
Node Classification	NBA	H(Acc, ΔDP, ΔEO) 3.6595	17
Node Classification	Pokec-n (test)	--	16
Node Classification	NBA (synthetic)	Accuracy 70.52	9
Link Prediction	NBA	Micro-F1 30.1	9
Link Prediction	German	Micro-F1 63	9
Node Classification	Pokec-z (synthetic)	Accuracy 67.89	8
Node Classification	Pokec-z	H(Acc, ΔDP, ΔEO) 0.3247	8
Node Classification	Pokec-n (synthetic)	Accuracy 55.18	8
Node Classification	NBA	Accuracy 55.46	8

Showing 10 of 15 rows
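
The table reports ΔDP and ΔEO without defining them. A minimal sketch, assuming the definitions commonly used for binary node classification with a binary sensitive attribute: ΔDP is the absolute gap in positive-prediction rate between the two groups (demographic parity), and ΔEO is the same gap restricted to nodes whose true label is positive (equal opportunity). The function names and toy data are hypothetical, and the combined score H(Acc, ΔDP, ΔEO) is not defined on this page, so it is not computed here.

```python
# Sketch under assumed standard definitions; not taken from the benchmark's code.

def positive_rate(preds):
    # Assumes the list is non-empty.
    return sum(preds) / len(preds)

def delta_dp(y_pred, sensitive):
    """|P(pred=1 | s=0) - P(pred=1 | s=1)|"""
    g0 = [p for p, s in zip(y_pred, sensitive) if s == 0]
    g1 = [p for p, s in zip(y_pred, sensitive) if s == 1]
    return abs(positive_rate(g0) - positive_rate(g1))

def delta_eo(y_true, y_pred, sensitive):
    """|P(pred=1 | y=1, s=0) - P(pred=1 | y=1, s=1)|"""
    g0 = [p for y, p, s in zip(y_true, y_pred, sensitive) if y == 1 and s == 0]
    g1 = [p for y, p, s in zip(y_true, y_pred, sensitive) if y == 1 and s == 1]
    return abs(positive_rate(g0) - positive_rate(g1))

# Hypothetical toy labels, predictions, and sensitive attribute.
y_true    = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred    = [1, 1, 1, 0, 1, 0, 0, 0]
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]

dp = delta_dp(y_pred, sensitive)
eo = delta_eo(y_true, y_pred, sensitive)
print(f"delta_dp={dp:.2f} delta_eo={eo:.2f}")
```

Lower values of both gaps indicate more similar treatment of the two groups; each function assumes both groups (and, for ΔEO, both groups' positive subsets) are non-empty.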
