Toward General and Robust LLM-enhanced Text-attributed Graph Learning

About

Recent advancements in Large Language Models (LLMs) and the proliferation of Text-Attributed Graphs (TAGs) across various domains have positioned LLM-enhanced TAG learning as a critical research area. By utilizing rich graph descriptions, this paradigm leverages LLMs to generate high-quality embeddings, thereby enhancing the representational capacity of Graph Neural Networks (GNNs). However, the field faces two significant challenges: (1) the absence of a unified framework to systematize the diverse optimization perspectives arising from the complex interactions between LLMs and GNNs, and (2) the lack of a robust method capable of handling real-world TAGs, which often suffer from text and edge sparsity, leading to suboptimal performance. To address these challenges, we propose UltraTAG, a unified pipeline for LLM-enhanced TAG learning. UltraTAG provides a comprehensive, domain-adaptive framework that not only organizes existing methodologies but also paves the way for future advancements in the field. Building on this framework, we propose UltraTAG-S, a robust instantiation of UltraTAG designed to tackle the inherent sparsity issues in real-world TAGs. UltraTAG-S employs LLM-based text propagation and text augmentation to mitigate text sparsity, while leveraging LLM-augmented node selection techniques based on PageRank and edge reconfiguration strategies to address edge sparsity. Our extensive experiments demonstrate that UltraTAG-S significantly outperforms existing baselines, achieving improvements of 2.12% and 17.47% in ideal and sparse settings, respectively. Moreover, as the data sparsity ratio increases, the performance improvement of UltraTAG-S also rises, which underscores the effectiveness and robustness of UltraTAG-S.
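
The abstract mentions node selection based on PageRank but gives no implementation details. The sketch below only illustrates the general idea of ranking nodes by PageRank and keeping the top few; it is not the authors' method. The function names `pagerank` and `select_top_k`, the damping factor of 0.85, and the toy graph are all assumptions made for illustration, and the LLM-augmented part of the selection is not modeled.

```python
def pagerank(adj, damping=0.85, iters=100, tol=1e-10):
    """Power-iteration PageRank for a graph given as {node: [out-neighbors]}.

    Hypothetical helper for illustration; every neighbor must also be a key.
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if adj[v]:
                share = damping * rank[v] / len(adj[v])
                for u in adj[v]:
                    new[u] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def select_top_k(adj, k):
    """Return the k nodes with the highest PageRank (ties broken by node id)."""
    scores = pagerank(adj)
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


# Toy graph (assumed): node 0 is a hub, node 4 is isolated.
graph = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0], 4: []}
top = select_top_k(graph, 2)
print(top)
```

In this toy graph the hub ranks first and the isolated node ranks last, which is the kind of signal a sparsity-aware pipeline could use to decide which nodes to process first.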

Zihao Zhang, Xunkai Li, Rong-Hua Li, Zhenjun Li, Bing Zhou, Guoren Wang • 2025

Related benchmarks

Task	Dataset	Result
Node Classification	Cora (test)	Mean Accuracy 90.96	951
Node Classification	Cora	Accuracy 88.34	609
Node Classification	Pubmed	Accuracy 92.41	501
Node Classification	Photo	Accuracy 84.7	285
Node Classification	Reddit (test)	Accuracy 63.78	201
Node Classification	PubMed (test)	Accuracy 92.41	198
Node Classification	Photo	Accuracy 84.69	153
Node Classification	Wiki-CS (test)	Accuracy 83.05	146
Node Classification	Citeseer	Accuracy (%) 77.52	105
Node Classification	Instagram (test)	Accuracy 66.69	39

Showing 10 of 11 rows
