LPNL: Scalable Link Prediction with Large Language Models
About
Applying large language models (LLMs) to graph learning is an emerging endeavor. However, the vast amount of information inherent in large graphs poses significant challenges to this process. This work focuses on the link prediction task and introduces LPNL (Link Prediction via Natural Language), an LLM-based framework designed for scalable link prediction on large-scale heterogeneous graphs. We design novel prompts for link prediction that articulate graph details in natural language. To keep the amount of information manageable, we propose a two-stage sampling pipeline that extracts crucial information from the graphs, and a divide-and-conquer strategy that keeps the input tokens within predefined limits. We fine-tune a T5 model using a self-supervised learning scheme designed for link prediction. Extensive experimental results demonstrate that LPNL outperforms multiple advanced baselines in link prediction tasks on large-scale graphs.
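The abstract does not spell out the divide-and-conquer strategy, so the following is only a minimal sketch of one plausible reading: candidate nodes are split into groups small enough to fit in one prompt, a winner is picked per group, and the winners are compared again until one remains. The names `divide_and_conquer_select`, `pick_best`, and `max_per_prompt` are hypothetical, and `pick_best` is a stand-in for a call to the fine-tuned model, not the authors' implementation.

```python
from typing import Callable, List, Sequence


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def divide_and_conquer_select(
    candidates: Sequence[str],
    pick_best: Callable[[List[str]], str],
    max_per_prompt: int,
) -> str:
    """Reduce a large candidate set to one prediction, group by group.

    `pick_best` is a hypothetical stand-in for one model call that sees
    at most `max_per_prompt` candidates and returns the most likely one.
    """
    if max_per_prompt < 2:
        raise ValueError("max_per_prompt must be at least 2")
    if not candidates:
        raise ValueError("candidates must not be empty")

    pool = list(candidates)
    # Each round shrinks the pool to one winner per group.
    while len(pool) > 1:
        pool = [pick_best(group) for group in chunk(pool, max_per_prompt)]
    return pool[0]


# Toy usage: a score table replaces the language model (an assumption).
toy_scores = {"a": 0.1, "b": 0.7, "c": 0.3, "d": 0.9, "e": 0.2}
calls: List[List[str]] = []


def toy_pick_best(group: List[str]) -> str:
    calls.append(group)  # record what each "prompt" would contain
    return max(group, key=lambda node: toy_scores[node])


prediction = divide_and_conquer_select(list(toy_scores), toy_pick_best, 2)
print(prediction)
```

In this sketch no single call ever sees more than `max_per_prompt` candidates, which is the property that would bound the prompt length; how LPNL actually sizes the groups against a token limit is not stated in the abstract.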
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Author Name Disambiguation | OAG Computer Science (CS) (test) | NDCG | 98.5 | 6 |
| Author Name Disambiguation | OAG Material Science (Mater) (test) | NDCG | 0.954 | 6 |
| Author Name Disambiguation | OAG Engineering (Engin) (test) | NDCG | 97.7 | 6 |
| Author Name Disambiguation | OAG Chemistry (test) | NDCG | 0.955 | 6 |