Unlocking Graph Structure Learning with Tree-Guided Large Language Models

About

Recently, the emergence of large language models (LLMs) has motivated integrating language descriptions into graphs, forming text-attributed graphs (TAGs) that enhance model encoding capabilities from a data-centric perspective. A review of prior advancements highlights that graph structure learning (GSL) is a pivotal technique for improving data utility, making it highly relevant to efficient TAG learning. However, most GSL methods are tailored for traditional graphs without textual information, underscoring the necessity of developing a new GSL paradigm. Despite clear motivations, it remains challenging: (1) How can we define a reasonable optimization objective for GSL in the era of LLMs, considering the massive parameters in LLMs? (2) How can we design an efficient model architecture that enables seamless integration of LLMs for this optimization objective? For Question 1, we reformulate existing GSL optimization objectives as a tree optimization framework, shifting the focus from obtaining a well-trained edge predictor to a language-aware tree sampler. For Question 2, we propose decoupled and training-free model design principles for LLM integration, shifting the focus from computation-intensive fine-tuning to more efficient inference. Based on this, we propose Large Language and Tree Assistant (LLaTA), which leverages tree-based LLM in-context learning to enhance the understanding of topology and text, enabling reliable inference and generating improved graph structure. Extensive experiments on 11 datasets demonstrate that LLaTA enjoys flexibility-incorporated with any backbone; scalability-outperforms other LLM-based GSL methods; and effectiveness-achieving SOTA predictive performance across a variety of datasets from different domains.

Zhihan Zhang, Xunkai Li, Lei Zhu, Guang Zeng, Bowen Fan, Yanzhe Wen, Hongchao Qin, Rong-Hua Li, Guoren Wang• 2025

Related benchmarks

Task	Dataset	Result
Node Classification	Cora	Accuracy84.65	1215
Node Classification	Citeseer	Accuracy78.21	393
Node Classification	wikiCS	Accuracy81.58	317
Node Classification	arXiv	Accuracy73.24	219
Node Classification	REDDIT	Accuracy67.62	192
Node Classification	amazon-ratings	Accuracy44.47	173
Node Classification	Instagram	Accuracy64.53	34
Node Classification	Children	Accuracy47.26	19

Showing 8 of 8 rows

Other info

Follow for update

@wizwand_team Discord