ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search

About

Large language models (LLMs) have demonstrated powerful decision-making and planning capabilities in solving complicated real-world problems. LLM-based autonomous agents can interact with diverse tools (e.g., functional APIs) and generate solution plans that execute a series of API function calls step by step. The multitude of candidate API function calls significantly expands the action space, amplifying the need for efficient action space navigation. However, existing methods either explore the expansive action space in a single direction and become trapped in locally optimal solutions, or exhaustively traverse all potential actions, which makes navigation inefficient. To address these issues, we propose ToolChain*, an efficient tree search-based planning algorithm for LLM-based agents. It formulates the entire action space as a decision tree, where each node represents a possible API function call involved in a solution plan. By incorporating the A* search algorithm with a task-specific cost function design, it efficiently prunes high-cost branches that may involve incorrect actions and identifies the lowest-cost valid path as the solution. Extensive experiments on multiple tool-use and reasoning tasks demonstrate that ToolChain* efficiently balances exploration and exploitation within an expansive action space. It outperforms state-of-the-art baselines on planning and reasoning tasks by 3.1% and 3.5% on average while requiring 7.35x and 2.31x less time, respectively.

Yuchen Zhuang, Xiang Chen, Tong Yu, Saayan Mitra, Victor Bursztyn, Ryan A. Rossi, Somdeb Sarkhel, Chao Zhang • 2023
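
To make the abstract's idea concrete, here is a minimal sketch of A* search over a tree of tool calls. It is not the authors' implementation: the tool names, the target plan, the hand-coded step cost, and the "remaining steps" heuristic are all hypothetical stand-ins for the task-specific cost functions the abstract mentions. Each node is a partial plan (a tuple of action names), and the search always expands the node with the lowest f = g + h, so high-cost branches are never expanded once a cheaper complete plan is reached.

```python
import heapq
from itertools import count

# Hypothetical toy task: the tool names and the target plan are illustrative only.
TOOLS = ("search_flights", "book_flight", "send_email", "get_weather")
GOAL = ("search_flights", "book_flight", "send_email")


def expand(plan):
    """Candidate next API calls for a partial plan (children in the decision tree)."""
    return TOOLS if len(plan) < len(GOAL) else ()


def step_cost(plan, action):
    """Hand-coded stand-in for a task-specific cost: cheap if the call stays on track."""
    i = len(plan)
    on_track = plan == GOAL[:i] and i < len(GOAL) and action == GOAL[i]
    return 1.0 if on_track else 5.0


def heuristic(plan):
    """Optimistic estimate of the remaining cost: one unit per missing step."""
    return float(max(0, len(GOAL) - len(plan)))


def is_goal(plan):
    return plan == GOAL


def a_star_plan(start=(), max_expansions=1000):
    """Return (plan, cost) for the lowest-cost plan found, or (None, inf)."""
    tie = count()  # tie-breaker so the heap never has to compare plans directly
    frontier = [(heuristic(start), next(tie), 0.0, start)]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, g, plan = heapq.heappop(frontier)  # node with the lowest f = g + h
        if is_goal(plan):
            return plan, g
        expansions += 1
        for action in expand(plan):
            child = plan + (action,)
            g_child = g + step_cost(plan, action)
            f_child = g_child + heuristic(child)
            heapq.heappush(frontier, (f_child, next(tie), g_child, child))
    return None, float("inf")


plan, cost = a_star_plan()
print(plan, cost)
```

In this toy setup the off-track branches receive a cost of 5.0 per step, so they stay in the frontier but are never expanded before the on-track path reaches the goal. In a real agent, the cost and heuristic would have to come from the task itself rather than from a known target plan.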

Related benchmarks

Task	Dataset	Metric	Value	(unlabeled)
Financial Analysis	MCP-Universe Financial Analysis	Success Rate	0.85	24
Location Navigation	MCP-Universe Location Navigation	Success Rate	29.97	24
Repository Management	MCP-Universe Repository Management	Success Rate	39.39	24
Tool Reasoning	ToolBench (G1)	Pass Rate	82.8	24
Tool Reasoning	ToolBench (G2)	Pass Rate	90.5	24
Tool Reasoning	ToolBench (G3)	Pass Rate	88.5	24
Tool Planning	GTA	Tool Selection F1 (Step-by-step)	74.29	16
Tool Planning	m&m	Tool Selection F1 (Step-by-step)	87.17	16
