Can Graph Learning Improve Planning in LLM-based Agents?
About
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests in natural language into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, task planning is a decision-making problem that involves selecting a connected path or subgraph within the corresponding graph and invoking it. In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design. Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs, which is adeptly addressed by graph neural networks (GNNs). This theoretical insight led us to integrate GNNs with LLMs to enhance overall performance. Extensive experiments demonstrate that GNN-based methods surpass existing solutions even without training, and minimal training can further enhance their performance. The performance gain increases with a larger task graph size.
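To make the graph formulation concrete, the following is a minimal sketch of planning as path selection over a task graph. It is not the paper's GNN-based method: the tool names, descriptions, and the `plan` and `score` helpers are hypothetical, and a simple token-overlap score stands in for whatever learned scorer (an LLM or a GNN) would rank candidate nodes. The only idea it illustrates is that each chosen sub-task restricts the next choice to its successors in the dependency graph.

```python
from typing import Dict, List, Set

# Hypothetical task graph: nodes are sub-tasks (tools), and an edge a -> b
# means the output of a can feed b. Names are illustrative only.
TASK_GRAPH: Dict[str, List[str]] = {
    "speech_to_text": ["translate_text", "summarize_text"],
    "translate_text": ["summarize_text", "text_to_speech"],
    "summarize_text": ["text_to_speech"],
    "text_to_speech": [],
}

# Hypothetical natural-language descriptions of each node.
DESCRIPTIONS: Dict[str, str] = {
    "speech_to_text": "transcribe audio speech into text",
    "translate_text": "translate text into another language",
    "summarize_text": "summarize long text into short summary",
    "text_to_speech": "read text aloud as audio speech",
}


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokenization."""
    return set(text.lower().split())


def score(step: str, node: str) -> float:
    """Jaccard overlap between a step and a node description.

    This is a stand-in for a learned relevance score.
    """
    a, b = tokens(step), tokens(DESCRIPTIONS[node])
    return len(a & b) / len(a | b)


def plan(steps: List[str]) -> List[str]:
    """Greedily map each decomposed step to a node, following graph edges."""
    path: List[str] = []
    candidates = list(TASK_GRAPH)  # the first step may start at any node
    for step in steps:
        if not candidates:  # reached a node with no successors
            break
        best = max(candidates, key=lambda n: score(step, n))
        path.append(best)
        candidates = TASK_GRAPH[best]  # next choice must be a successor
    return path


if __name__ == "__main__":
    request_steps = [
        "transcribe the audio",
        "translate the text into French",
        "read the text aloud",
    ]
    print(plan(request_steps))
```

Because candidates are limited to successors of the previously selected node, every returned plan is a connected path in the graph, which is the structural constraint the abstract describes.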
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Task Planning | Hugging Face | Node-F1 77.79 | 25 |
| Task Planning | TaskBench Multimedia | Node F1 88.54 | 25 |
| Task Planning | RestBench TMDB | Node F1 82.63 | 25 |
| Task Planning | TaskBench Daily Life | Node-F1 97.35 | 25 |
| Planning | UltraTool (test) | n-F1 72.81 | 24 |
| Graph Node Classification and Link Prediction | HuggingFace | n-F1 78.76 | 23 |
| Graph Node Classification and Link Prediction | Multimedia | n-F1 88.86 | 23 |
| Graph Node Classification and Link Prediction | Daily Life | n-F1 97.42 | 23 |
| Task Planning | Hugging Face v1 (test) | n-F1 77.79 | 17 |
| Task Planning | TaskBench Multimedia v1 (test) | n-F1 88.54 | 14 |