
ToolTree: Efficient LLM Agent Tool Planning via Dual-Feedback Monte Carlo Tree Search and Bidirectional Pruning

About

Large Language Model (LLM) agents are increasingly applied to complex, multi-step tasks that require interaction with diverse external tools across various domains. However, current LLM agent tool planning methods typically rely on greedy, reactive tool selection strategies that lack foresight and fail to account for inter-tool dependencies. In this paper, we present ToolTree, a novel planning paradigm for tool use inspired by Monte Carlo tree search (MCTS). ToolTree explores possible tool usage trajectories using a dual-stage LLM evaluation and a bidirectional pruning mechanism. Together, these enable the agent to make informed, adaptive decisions over extended tool-use sequences while pruning less promising branches both before and after tool execution. Empirical evaluations on both open-set and closed-set tool planning tasks across 4 benchmarks demonstrate that ToolTree consistently improves performance while maintaining the highest efficiency, achieving an average gain of around 10% over the state-of-the-art planning paradigm.
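To make the idea in the abstract concrete, the following is a minimal toy sketch of a tree search over tool sequences with pruning before and after tool execution. It is not the authors' implementation: the names `plan`, `toy_pre_score`, `toy_post_score`, the thresholds `pre_min` and `post_min`, the UCT selection rule, and the hard-coded scores are all assumptions made for illustration. In a real agent, the two scoring functions would presumably be LLM judgments, and the post-execution score would follow an actual tool call.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Path = Tuple[str, ...]


@dataclass
class Node:
    path: Path                              # tool sequence leading to this node
    reward: float = 0.0                     # post-execution score of this node
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    untried: Optional[List[str]] = None     # candidate tools not yet executed
    visits: int = 0
    total: float = 0.0                      # sum of backed-up rewards


def uct(child: Node, parent_visits: int, c: float) -> float:
    """Standard UCT score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")
    return child.total / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)


def plan(tools: Sequence[str],
         pre_score: Callable[[Path, str], float],
         post_score: Callable[[Path], float],
         max_depth: int = 2, iterations: int = 20,
         pre_min: float = 0.3, post_min: float = 0.3,
         c: float = 1.4) -> Tuple[Path, float]:
    root = Node(path=())
    best = root
    for _ in range(iterations):
        # Selection: descend while the node is fully expanded and has children.
        node = root
        while True:
            if node.untried is None:
                if len(node.path) >= max_depth:
                    node.untried = []
                else:
                    # Pruning BEFORE execution: drop tools with a low prior score.
                    node.untried = [t for t in tools if pre_score(node.path, t) >= pre_min]
            if node.untried or not node.children:
                break
            parent = node
            node = max(parent.children, key=lambda ch: uct(ch, parent.visits, c))
        reward = node.reward
        if node.untried:
            # Expansion: "execute" one candidate tool and score the outcome.
            tool = node.untried.pop(0)
            child = Node(path=node.path + (tool,), parent=node)
            child.reward = post_score(child.path)
            reward = child.reward
            if child.reward >= post_min:
                node.children.append(child)
                node = child
                if child.reward > best.reward:
                    best = child
            # else: pruning AFTER execution, the child is never attached.
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best.path, best.reward


# Hypothetical toy setup: scores are hard-coded stand-ins for LLM feedback.
TOOLS = ["search", "calculator", "ocr"]
EXECUTED: List[Path] = []                   # records every simulated tool execution
TOY_OUTCOMES = {
    ("search",): 0.6,
    ("calculator",): 0.2,
    ("search", "search"): 0.4,
    ("search", "calculator"): 0.9,
}


def toy_pre_score(path: Path, tool: str) -> float:
    return 0.1 if tool == "ocr" else 0.8    # "ocr" is judged irrelevant up front


def toy_post_score(path: Path) -> float:
    EXECUTED.append(path)
    return TOY_OUTCOMES.get(path, 0.0)


best_path, best_reward = plan(TOOLS, toy_pre_score, toy_post_score)
print(best_path, best_reward)
```

In this toy run, `ocr` is filtered out before any execution, and the lone `calculator` call is discarded after its low outcome score, so the search never spends budget expanding beneath either. How ToolTree actually combines its two feedback signals and sets its pruning criteria is not specified in the abstract.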

Shuo Yang, Soyeon Caren Han, Yihao Ding, Shuhe Wang, Eduard Hoy • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Visual Question Answering | GQA | Accuracy 74.44 | 1249 |
| Text-based Visual Question Answering | TextVQA | Accuracy 85.43 | 807 |
| Multi-hop Question Answering | HotpotQA | -- | 294 |
| Science Question Answering | ScienceQA (SQA) | Accuracy 87.33 | 273 |
| Mathematical Multimodal Reasoning | MathVista | Accuracy 65.58 | 218 |
| Medical Visual Question Answering | VQA-RAD | Accuracy 74.12 | 198 |
| Medical Question Answering | MedQA | Accuracy 93.88 | 153 |
| Document Visual Question Answering | DocVQA | Accuracy 92.33 | 132 |
| Mathematical Reasoning | Game of 24 | Accuracy 47.85 | 103 |
| Knowledge-based Visual Question Answering | OKVQA | Accuracy 0.5927 | 79 |
Showing 10 of 20 rows
