Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition

About

This paper introduces a new approach to class imbalance in graph neural networks (GNNs) for learning on graph-structured data. The approach integrates imbalanced node classification with bias-variance decomposition, establishing a theoretical framework that closely relates data imbalance to model variance. It also leverages graph augmentation techniques to estimate the variance, and uses a regularization term to alleviate the impact of imbalance. Extensive experiments on multiple benchmarks, including naturally imbalanced datasets and public-split class-imbalanced datasets, show that the approach outperforms state-of-the-art methods in various imbalanced scenarios. This work provides a novel theoretical perspective on imbalanced node classification in GNNs.
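The idea of estimating variance through graph augmentation and penalizing it can be illustrated with a small sketch. This is not the authors' implementation: the one-layer mean-aggregation model, the edge-dropping augmentation, and all names (`drop_edges`, `propagate`, `variance_regularizer`, `n_views`, `drop_prob`) are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax with the usual max-subtraction for stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def drop_edges(adj, drop_prob, rng):
    # Hypothetical augmentation: drop each undirected edge independently
    # with probability drop_prob, keeping the matrix symmetric.
    upper = np.triu(adj, k=1)
    keep = rng.random(upper.shape) >= drop_prob
    upper = upper * keep
    return upper + upper.T

def propagate(adj, features, weight):
    # Toy stand-in for a GNN: add self-loops, average neighbor features,
    # apply one linear map, and return class probabilities.
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return softmax((a_hat / deg) @ features @ weight)

def variance_regularizer(adj, features, weight, n_views=4, drop_prob=0.2, seed=0):
    # Predict on several augmented views and measure how much the
    # predictions disagree: variance over views, summed over classes,
    # averaged over nodes.
    rng = np.random.default_rng(seed)
    preds = np.stack([
        propagate(drop_edges(adj, drop_prob, rng), features, weight)
        for _ in range(n_views)
    ])
    return float(preds.var(axis=0).sum(axis=1).mean())

# Small ring graph with random features and weights (illustrative data only).
n, d, c = 6, 3, 2
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
data_rng = np.random.default_rng(1)
features = data_rng.normal(size=(n, d))
weight = data_rng.normal(size=(d, c))

reg_no_aug = variance_regularizer(adj, features, weight, drop_prob=0.0)
reg_aug = variance_regularizer(adj, features, weight, drop_prob=0.5)
```

In a training loop, a term like this would typically be added to the supervised loss with a weighting coefficient, for example `loss = cross_entropy + lam * variance_term`, where `lam` is a hypothetical hyperparameter. With `drop_prob=0.0` every view is identical, so the estimated variance is zero.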

Liang Yan, Gengchen Wei, Chen Yang, Shengzhong Zhang, Zengfeng Huang • 2023

Related benchmarks

Task	Dataset	Metric	Result	
Node Classification	Cora (semi-supervised)	--	--	103
Node Classification	CS-Random (test)	Balanced Accuracy	90.11	72
Node Classification	Computers Random rho=25.50 (test)	Balanced Accuracy	85	33
Node Classification	Citeseer semi-supervised (test)	Accuracy	66.04	26
Entity Classification	amazon-user-churn	B-Acc	0.6311	11
Entity Classification	event-user-repeat	B-Acc	70.02	11
Entity Classification	hm-user-churn	B-Acc	56.18	11
Entity Classification	f1-driver top3	Balanced Accuracy (B-Acc)	55.79	11
Entity Classification	avito-user-visits	B-Acc	50.04	11
Entity Classification	stack-user-engagement	B-Acc	58.51	11

Showing 10 of 18 rows
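Several rows report balanced accuracy (B-Acc). The page does not define the metric, so the sketch below assumes the standard definition, the unweighted mean of per-class recall; the function name `balanced_accuracy` and the toy labels are illustrative only.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recall over the classes present in y_true.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Imbalanced toy labels: 8 nodes of class 0, 2 nodes of class 1.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10  # a classifier that always predicts the majority class

plain_acc = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
bal_acc = balanced_accuracy(y_true, y_pred)
```

Here the majority-class predictor reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, which is why the metric is preferred on class-imbalanced benchmarks.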
