Supervised Graph Contrastive Learning for Gene Regulatory Networks

About

Graph Contrastive Learning (GCL) is a powerful self-supervised learning framework that performs data augmentation through graph perturbations, with growing applications in the analysis of biological networks such as Gene Regulatory Networks (GRNs). The artificial perturbations commonly used in GCL, such as node dropping, induce structural changes that can diverge from biological reality. This concern has contributed to a broader trend in graph representation learning toward augmentation-free methods, which treat such structural changes as a problem to be avoided. However, this trend overlooks a fundamental insight: structural changes caused by biologically meaningful perturbations are not a problem but a rich source of information, so avoiding them discards the opportunity to leverage data from real biological experiments. Motivated by this insight, we propose SupGCL (Supervised Graph Contrastive Learning), a new GCL method for GRNs that directly incorporates biological perturbations from gene knockdown experiments as supervision. SupGCL is a probabilistic formulation that continuously generalizes conventional GCL, linking artificial augmentations with real perturbations measured in knockdown experiments and using the latter as explicit supervision. On patient-derived GRNs from three cancer types, we train GRN representations with SupGCL and evaluate it in two regimes: (i) embedding space analysis, where it yields clearer disease-subtype structure and improves clustering, and (ii) task-specific fine-tuning, where it consistently outperforms strong graph representation learning baselines on 13 downstream tasks spanning gene-level functional annotation and patient-level prediction.
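To make the conventional GCL setup mentioned in the abstract concrete, the following is a minimal sketch of node dropping as an artificial augmentation, paired with a standard InfoNCE contrastive loss. It is not the SupGCL objective, which this page does not specify. The function names (`drop_nodes`, `info_nce`), the temperature value, and the toy embeddings are all assumptions made for illustration.

```python
import math
import random


def drop_nodes(edges, num_nodes, drop_prob, rng):
    """Artificial augmentation: drop each node with probability drop_prob,
    together with every edge that touches a dropped node."""
    kept = {v for v in range(num_nodes) if rng.random() >= drop_prob}
    kept_edges = [(u, v) for (u, v) in edges if u in kept and v in kept]
    return kept, kept_edges


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def info_nce(view_a, view_b, temperature=0.5):
    """Standard InfoNCE loss: embedding i in view_a is pulled toward
    embedding i in view_b and pushed away from the other rows of view_b."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy usage (hypothetical data): a 4-node cycle graph and 2-d embeddings.
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
kept, kept_edges = drop_nodes(edges, num_nodes=4, drop_prob=0.25, rng=rng)

aligned = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
shuffled = [aligned[1], aligned[2], aligned[0]]
loss_aligned = info_nce(aligned, aligned)    # positives match their anchors
loss_shuffled = info_nce(aligned, shuffled)  # positives are mismatched
```

In this sketch the two views come from random perturbations. As the abstract describes it, SupGCL instead links such artificial augmentations to perturbations measured in knockdown experiments and uses the latter as supervision.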

Sho Oshima, Yuji Okamoto, Taisei Tosaki, Ryosuke Kojima • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Node Classification | Colorectal Cancer BP (test) | Macro F1 Score: 0.547 | 12 |
| BP Classification | Breast Cancer GRN | Subset Accuracy: 24.3 | 11 |
| Hazard Prediction | Breast Cancer GRN | C-Index: 0.65 | 11 |
| Biological Process (BP) classification | Breast cancer dataset Node-level | Jaccard Index: 50 | 6 |
| Biological Process Prediction | TCGA Breast Cancer BP | Subset Accuracy: 24.3 | 6 |
| Biological Process Prediction | TCGA Lung Cancer BP | Subset Accuracy: 28.2 | 6 |
| Biological Process Prediction | TCGA Colorectal Cancer BP | Subset Accuracy: 26.2 | 6 |
| Cellular Component (CC) classification | Breast cancer dataset Node-level | Jaccard Index: 42.2 | 6 |
| Cellular Component (CC) classification | Lung cancer dataset Node-level | Jaccard Index: 39.2 | 6 |
| Cellular Component (CC) classification | Colorectal cancer dataset Node-level | Jaccard Index: 39.5 | 6 |
Showing 10 of 29 rows
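
The metrics named in the table can be illustrated with a short sketch using their common textbook definitions. This page does not state the exact variants behind the reported numbers (for example, how the Jaccard index is averaged, or whether scores are fractions or percentages), so the definitions, function names, and toy labels below are assumptions for illustration only.

```python
def subset_accuracy(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true set exactly."""
    matches = sum(1 for t, p in zip(true_sets, pred_sets) if set(t) == set(p))
    return matches / len(true_sets)


def mean_jaccard(true_sets, pred_sets):
    """Per-sample intersection-over-union of label sets, averaged over samples.
    A sample with empty true and predicted sets scores 1.0 (assumed convention)."""
    scores = []
    for t, p in zip(true_sets, pred_sets):
        t, p = set(t), set(p)
        union = t | p
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def concordance_index(times, events, risks):
    """Harrell's C-index: among comparable pairs, the fraction in which the
    subject with the earlier observed event has the higher predicted risk.
    Tied risks count as half-concordant."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's (event or censoring) time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy usage with hypothetical gene annotations and survival data.
true_labels = [{"term_a"}, {"term_a", "term_b"}]
pred_labels = [{"term_a"}, {"term_a"}]
acc = subset_accuracy(true_labels, pred_labels)
jac = mean_jaccard(true_labels, pred_labels)
cidx = concordance_index(times=[1, 2, 3], events=[1, 1, 0], risks=[0.9, 0.5, 0.1])
```

Subset accuracy is the strictest of these: a gene with one missing annotation counts as fully wrong, while the Jaccard index gives partial credit, which is one plausible reason the Jaccard values in the table are higher than the subset accuracy values.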
