Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs

About

Large language models (LLMs), while exhibiting exceptional performance, suffer from hallucinations, especially on knowledge-intensive tasks. Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora to alleviate the issue. However, in many domains, texts are interconnected (e.g., academic papers in a bibliographic graph are linked by citations and co-authorships), forming a (text-attributed) graph. The knowledge in such graphs is encoded not only in single texts/nodes but also in their associated connections. To facilitate research on augmenting LLMs with graphs, we manually construct a Graph Reasoning Benchmark dataset called GRBench, containing 1,740 questions that can be answered with the knowledge from 10 domain graphs. Then, we propose a simple and effective framework called Graph Chain-of-Thought (Graph-CoT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively. Each Graph-CoT iteration consists of three sub-steps: LLM reasoning, LLM-graph interaction, and graph execution. We conduct systematic experiments with three LLM backbones on GRBench, where Graph-CoT outperforms the baselines consistently. The code is available at https://github.com/PeterGriffinJin/Graph-CoT.
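
The three sub-steps named in the abstract can be pictured as a loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the toy graph, the function names (`node_feature`, `neighbors`, `graph_cot`), and the `scripted_llm` stand-in (which replaces a real LLM call with fixed responses) are all hypothetical. See the linked repository for the actual code.

```python
# Toy text-attributed graph (hypothetical data): node id -> text feature
# and typed neighbor lists.
GRAPH = {
    "p1": {"title": "Paper A", "neighbors": {"cites": ["p2"], "author": ["a1"]}},
    "p2": {"title": "Paper B", "neighbors": {"cites": [], "author": ["a1"]}},
    "a1": {"title": "Alice", "neighbors": {"paper": ["p1", "p2"]}},
}


def node_feature(node_id, feature):
    """Return a stored feature (here only 'title') of a node."""
    return GRAPH[node_id][feature]


def neighbors(node_id, edge_type):
    """Return the neighbors of a node along one edge type."""
    return GRAPH[node_id]["neighbors"].get(edge_type, [])


# Functions the "LLM" is allowed to call on the graph (assumed names).
GRAPH_FUNCTIONS = {"node_feature": node_feature, "neighbors": neighbors}


def scripted_llm(question, history):
    """Stand-in for an LLM: returns (thought, (function_name, args)).

    A real system would prompt a model with the question and history;
    this stub follows a fixed script so the example is runnable offline.
    """
    step = len(history)
    if step == 0:
        return "I need the authors of p1.", ("neighbors", ("p1", "author"))
    if step == 1:
        author_id = history[-1][2][0]  # first node id from the last observation
        return "Look up the author's name.", ("node_feature", (author_id, "title"))
    return "I have the answer.", ("finish", (history[-1][2],))


def graph_cot(question, llm, max_iters=5):
    """Iterate: reasoning -> LLM-graph interaction -> graph execution."""
    history = []
    for _ in range(max_iters):
        # Sub-steps 1 and 2: the model reasons, then names a graph function call.
        thought, (name, args) = llm(question, history)
        if name == "finish":
            return args[0]
        # Sub-step 3: execute the call on the graph and record the observation.
        observation = GRAPH_FUNCTIONS[name](*args)
        history.append((thought, (name, args), observation))
    return None  # no answer within the iteration budget


answer = graph_cot("Who wrote Paper A?", scripted_llm)
print(answer)
```

The iteration cap (`max_iters`) is a design choice of this sketch: it keeps the loop from running forever if the model never decides it has enough information.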

Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Zheng Li, Ruirui Li, Xianfeng Tang, Suhang Wang, Yu Meng, Jiawei Han • 2024

Related benchmarks

Task                 Dataset               Result            Rank
Node Classification  Computers             Accuracy 62.6     85
Graph Reasoning      GRBENCH Legal         QwenScore 55.5    32
Graph Reasoning      GRBENCH Academic      QwenScore 0.629   32
Graph Reasoning      GRBENCH Healthcare    QwenScore 44.1    32
Graph Reasoning      GRBENCH Literature    QwenScore 56.3    32
Graph Reasoning      GRBENCH E-Commerce    QwenScore 0.47    32
Node Classification  Sports                Accuracy 63.8     30
Node Classification  OGB-Arxiv In-Domain   Accuracy 53.1     27
Medical Reasoning    NEEMRs                Recall 45.43      22
Medical Reasoning    XMEMRs                Recall 40.64      22
Showing 10 of 20 rows
