Unlocking Graph Structure Learning with Tree-Guided Large Language Models
About
Recently, the emergence of large language models (LLMs) has motivated integrating language descriptions into graphs, forming text-attributed graphs (TAGs) that enhance model encoding capabilities from a data-centric perspective. A review of prior advances highlights graph structure learning (GSL) as a pivotal technique for improving data utility, making it highly relevant to efficient TAG learning. However, most GSL methods are tailored to traditional graphs without textual information, underscoring the need for a new GSL paradigm. Despite this clear motivation, two challenges remain: (1) How can we define a reasonable optimization objective for GSL in the era of LLMs, given their massive parameter counts? (2) How can we design an efficient model architecture that enables seamless integration of LLMs under this objective? For Question 1, we reformulate existing GSL optimization objectives as a tree optimization framework, shifting the focus from training an edge predictor to building a language-aware tree sampler. For Question 2, we propose decoupled and training-free design principles for LLM integration, shifting the focus from computation-intensive fine-tuning to more efficient inference. Based on these ideas, we propose the Large Language and Tree Assistant (LLaTA), which leverages tree-based LLM in-context learning to enhance the understanding of topology and text, enabling reliable inference and producing improved graph structures. Extensive experiments on 11 datasets demonstrate that LLaTA offers flexibility (it can be incorporated with any backbone), scalability (it outperforms other LLM-based GSL methods), and effectiveness (it achieves SOTA predictive performance across datasets from different domains).
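The core idea of a language-aware tree sampler can be caricatured as follows. This is a minimal, hypothetical sketch, not the paper's actual method: a token-overlap similarity stands in for the LLM's in-context affinity judgment, and the sampler keeps the highest-affinity candidate edges that form a spanning tree (Kruskal-style union-find).

```python
def text_affinity(a: str, b: str) -> float:
    # Stand-in for an LLM in-context relevance score (hypothetical):
    # Jaccard overlap between the token sets of two node descriptions.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))

def tree_guided_structure(texts, candidate_edges):
    """Sample a tree over candidate edges, greedily keeping the
    highest-affinity edges that do not create a cycle (Kruskal)."""
    parent = list(range(len(texts)))

    def find(x):
        # Union-find with path halving to detect cycles cheaply.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Rank candidate edges by text affinity, highest first.
    scored = sorted(
        candidate_edges,
        key=lambda e: text_affinity(texts[e[0]], texts[e[1]]),
        reverse=True,
    )
    tree = []
    for u, v in scored:
        ru, rv = find(u), find(v)
        if ru != rv:          # adding (u, v) keeps the structure acyclic
            parent[ru] = rv
            tree.append((u, v))
    return tree

texts = [
    "graph neural network survey",
    "graph structure learning for networks",
    "large language model prompting",
    "language model inference efficiency",
]
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
print(tree_guided_structure(texts, edges))
```

In the actual LLaTA pipeline the affinity signal comes from tree-based LLM in-context learning over both topology and text; swapping `text_affinity` for an LLM call is the only conceptual change this sketch elides.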
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Node Classification | Cora | Accuracy: 84.65 | 885 |
| Node Classification | Citeseer | Accuracy: 78.21 | 275 |
| Node Classification | wikiCS | Accuracy: 81.58 | 198 |
| Node Classification | amazon-ratings | Accuracy: 44.47 | 138 |
| Node Classification | | Accuracy: 67.62 | 66 |
| Node Classification | arXiv | Accuracy: 73.24 | 41 |
| Node Classification | | Accuracy: 64.53 | 23 |
| Node Classification | Children | Accuracy: 47.26 | 19 |