Can Graph Learning Improve Planning in LLM-based Agents?

About

Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests in natural language into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, task planning is a decision-making problem that involves selecting a connected path or subgraph within the corresponding graph and invoking it. In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design. Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs, which is adeptly addressed by graph neural networks (GNNs). This theoretical insight led us to integrate GNNs with LLMs to enhance overall performance. Extensive experiments demonstrate that GNN-based methods surpass existing solutions even without training, and minimal training can further enhance their performance. The performance gain increases with a larger task graph size.
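The graph view described above can be made concrete with a small sketch. The example below is not the paper's GNN-based method; it only illustrates the formulation, in which sub-tasks are nodes, dependencies are directed edges, and a plan is a connected path through the graph. The sub-task names, the TASK_GRAPH dictionary, and the helpers is_valid_path and find_path are all hypothetical and chosen for illustration.

```python
from collections import deque

# Hypothetical task graph: each key is a sub-task (node) and each value lists
# the sub-tasks that may directly follow it (directed dependency edges).
TASK_GRAPH = {
    "speech_to_text": ["translation", "summarization"],
    "translation": ["text_to_speech"],
    "summarization": ["text_to_speech"],
    "text_to_speech": [],
}


def is_valid_path(graph, plan):
    """Return True if every step is a known node and each consecutive
    pair of steps is joined by an edge in the graph."""
    if not plan or any(step not in graph for step in plan):
        return False
    return all(nxt in graph[cur] for cur, nxt in zip(plan, plan[1:]))


def find_path(graph, start, goal):
    """Breadth-first search for a shortest path from start to goal.
    Returns the path as a list of nodes, or None if no path exists."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

A plan proposed by a language model could be checked with is_valid_path before invocation; a plan that skips a dependency or names an unknown sub-task fails the check.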

Xixi Wu, Yifei Shen, Caihua Shan, Kaitao Song, Siwei Wang, Bohang Zhang, Jiarui Feng, Hong Cheng, Wei Chen, Yun Xiong, Dongsheng Li • 2024

Related benchmarks

Task                                           Dataset                         Result         Rank
Task Planning                                  Hugging Face                    Node-F1 77.79  25
Task Planning                                  TaskBench Multimedia            Node F1 88.54  25
Task Planning                                  RestBench TMDB                  Node F1 82.63  25
Task Planning                                  TaskBench Daily Life            Node-F1 97.35  25
Planning                                       UltraTool (test)                n-F1 72.81     24
Graph Node Classification and Link Prediction  HuggingFace                     n-F1 78.76     23
Graph Node Classification and Link Prediction  Multimedia                      n-F1 88.86     23
Graph Node Classification and Link Prediction  Daily Life                      n-F1 97.42     23
Task Planning                                  Hugging Face v1 (test)          n-F1 77.79     17
Task Planning                                  TaskBench Multimedia v1 (test)  n-F1 88.54     14
Showing 10 of 13 rows
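
The Result column reports node-level F1 scores. This page does not define the metric, so the sketch below rests on an assumption: that node F1 compares the set of sub-task nodes in a predicted plan against the set in a reference plan, and that the table values are percentages. The function name node_f1 is hypothetical, and individual benchmarks may aggregate scores across samples differently.

```python
def node_f1(predicted, reference):
    """Set-level F1 between predicted and reference sub-task nodes.

    Assumption: duplicates and ordering are ignored; only node
    membership counts. Returns a value in [0, 1].
    """
    pred, ref = set(predicted), set(reference)
    if not pred and not ref:
        return 1.0  # both plans empty: treat as a perfect match
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Under this assumption, a predicted plan that shares one of its two nodes with a two-node reference plan scores 0.5, which would correspond to 50 on the scale used in the table.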
