
Learning Noise-Resilient and Transferable Graph-Text Alignment via Dynamic Quality Assessment

About

Pre-training Graph Foundation Models (GFMs) on text-attributed graphs (TAGs) is central to web-scale applications such as search, recommendation, and knowledge discovery. However, existing CLIP-style graph-text aligners face two key limitations: they assume strict one-to-one correspondences between nodes and texts, overlooking the inherent many-to-many relations in real-world graphs; and they rely on static alignment objectives that cannot adapt to varying data quality, making them brittle under noisy supervision. Together, these limitations expose a core dilemma: embracing expressive many-to-many alignment amplifies noise, while reverting to strict one-to-one strategies sacrifices semantic diversity and fails to handle inherently mismatched pairs. To address these challenges, we propose ADAligner, a quality-aware graph-text alignment framework that dynamically adjusts between expressive many-to-many and conservative one-to-one objectives according to supervision quality. ADAligner estimates batch-level alignment reliability in real time and adapts its optimization accordingly, promoting soft, subgraph-level many-to-many alignment when supervision is clean, while emphasizing reliable one-to-one alignment by filtering low-confidence pairs under noise. Theoretically, we prove that this dynamic mechanism forms a stable negative feedback process, ensuring convergence and robustness. Comprehensive experiments on nine diverse TAG datasets demonstrate that ADAligner consistently outperforms prior graph-text aligners on zero-/few-shot node classification, link prediction, and cross-modal retrieval tasks. It maintains strong robustness under noisy supervision and accelerates pre-training by approximately 2 to 3 times compared to multimodal baselines, establishing a scalable and reliable foundation for graph-text representation learning in real-world web environments.
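The quality-adaptive idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the reliability score (fraction of nodes whose paired text is the top match), the linear blend between the two losses, the text-similarity soft targets, and the `temperature` and `keep_ratio` parameters are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def batch_reliability(sim):
    # Assumed quality score in [0, 1]: fraction of rows whose paired text
    # (the diagonal entry) is also the most similar text in the batch.
    return float(np.mean(sim.argmax(axis=1) == np.arange(sim.shape[0])))

def adaptive_alignment_loss(graph_emb, text_emb, temperature=0.1, keep_ratio=0.5):
    g = l2_normalize(np.asarray(graph_emb, dtype=float))
    t = l2_normalize(np.asarray(text_emb, dtype=float))
    sim = g @ t.T                          # node-to-text cosine similarities
    logp = log_softmax(sim / temperature)  # log-probability of each text per node
    n = sim.shape[0]
    quality = batch_reliability(sim)

    # Conservative one-to-one term: contrastive loss on the paired (diagonal)
    # entries, keeping only the most confident pairs (lowest per-pair loss).
    per_pair = -np.diag(logp)
    k = max(1, int(np.ceil(keep_ratio * n)))
    one_to_one = np.sort(per_pair)[:k].mean()

    # Expressive many-to-many term: cross-entropy against soft targets.
    # Assumption: targets come from text-to-text similarity within the batch.
    soft_targets = np.exp(log_softmax((t @ t.T) / temperature))
    many_to_many = -(soft_targets * logp).sum(axis=1).mean()

    # Assumed blend: lean on many-to-many when the batch looks clean,
    # fall back to filtered one-to-one when it looks noisy.
    loss = quality * many_to_many + (1.0 - quality) * one_to_one
    return float(loss), quality
```

With perfectly matched embeddings the reliability score is 1 and the soft many-to-many term dominates; if every text is shifted to the wrong node, the score drops to 0 and only the filtered one-to-one term contributes.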

Yuhang Liu, Minglai Shao, Zengyi Wo, Yunlong Chu, Bing Hao, Shengzhong Liu, Ruijie Wang, Jianxin Li • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Node Classification | Cora | Accuracy | 60.51 | 583 |
| Link Prediction | Cora | AUC (Cora) | 77.44 | 60 |
| Node Classification | Instagram | Accuracy | 54.84 | 60 |
| Node Classification | Books-History | Accuracy | 51.94 | 17 |
| Node Classification | Ele-Computers | Accuracy | 50.45 | 17 |
| Node Classification | Ele-Photo | Accuracy | 41.06 | 14 |
| Link Prediction | wikiCS | AUC | 73.33 | 13 |
| Link Prediction | History | AUC | 70.45 | 7 |