Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
About
While language models (LMs) have shown potential across a range of decision-making tasks, their reliance on simple acting processes limits their broad deployment as autonomous agents. In this paper, we introduce Language Agent Tree Search (LATS), the first general framework that synergizes the capabilities of LMs in reasoning, acting, and planning. By leveraging the in-context learning ability of LMs, we integrate Monte Carlo Tree Search into LATS to enable LMs as agents, along with LM-powered value functions and self-reflections for proficient exploration and enhanced decision-making. A key feature of our approach is the incorporation of an environment for external feedback, which offers a more deliberate and adaptive problem-solving mechanism that surpasses the constraints of existing techniques. Our experimental evaluation across diverse domains, including programming, interactive question-answering (QA), web navigation, and math, validates the effectiveness and generality of LATS in decision-making while maintaining competitive or improved reasoning performance. Notably, LATS achieves state-of-the-art pass@1 accuracy (92.7%) for programming on HumanEval with GPT-4 and demonstrates gradient-free performance (average score of 75.9) comparable to gradient-based fine-tuning for web navigation on WebShop with GPT-3.5. Code can be found at https://github.com/lapisrocks/LanguageAgentTreeSearch
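The search loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic Monte Carlo Tree Search on a toy task (reach a target sum from 0), where the hypothetical functions `propose_actions`, `env_step`, and `estimate_value` stand in for LM action sampling, the external environment, and an LM-powered value function. The self-reflection step is omitted, and the UCT selection rule and all names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

TARGET = 7  # toy goal (assumption): reach exactly 7 starting from 0


@dataclass
class Node:
    state: int
    action: Optional[int] = None       # action that led to this node
    parent: Optional["Node"] = None
    terminal: bool = False
    reward: float = 0.0                # environment reward at this node
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0                 # running mean of backed-up values


def propose_actions(state: int) -> List[int]:
    """Hypothetical stand-in for sampling candidate actions from an LM."""
    return [1, 2, 3]


def env_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Hypothetical stand-in for the external environment's feedback."""
    nxt = state + action
    if nxt == TARGET:
        return nxt, 1.0, True          # success
    return nxt, 0.0, nxt > TARGET      # overshooting ends the episode


def estimate_value(state: int) -> float:
    """Hypothetical stand-in for an LM value function (closeness heuristic)."""
    return max(0.0, 1.0 - abs(TARGET - state) / TARGET)


def uct(node: Node, c: float = 1.0) -> float:
    """Upper confidence bound: unvisited children are tried first."""
    if node.visits == 0:
        return math.inf
    return node.value + c * math.sqrt(math.log(node.parent.visits) / node.visits)


def search(iterations: int = 100) -> Optional[List[int]]:
    root = Node(state=0)
    for _ in range(iterations):
        # Selection: descend to a leaf by maximising UCT.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: query the environment for each proposed action.
        if not node.terminal:
            for a in propose_actions(node.state):
                s, r, done = env_step(node.state, a)
                node.children.append(
                    Node(state=s, action=a, parent=node, terminal=done, reward=r))
            node = max(node.children, key=lambda ch: estimate_value(ch.state))
        # Evaluation: environment reward if terminal, value estimate otherwise.
        value = node.reward if node.terminal else estimate_value(node.state)
        if node.terminal and node.reward == 1.0:
            path = []
            while node.parent is not None:   # walk back to recover the actions
                path.append(node.action)
                node = node.parent
            return path[::-1]
        # Backpropagation: update running means along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += (value - node.value) / node.visits
            node = node.parent
    return None


if __name__ == "__main__":
    print(search())
```

In this sketch the environment reward, rather than the value estimate, decides when a trajectory counts as successful, mirroring the abstract's emphasis on external feedback.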
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Code Generation | HumanEval (test) | Pass@1 92.7 | 444 |
| Knowledge Graph Question Answering | CWQ | -- | 105 |
| Code Generating | MBPP | Pass@1 81.1 | 88 |
| Web navigation and task completion | WebArena (test) | Average Task Completion 22.5 | 42 |
| Code Generation | APPS Intermediate | Pass Rate 45.86 | 32 |
| Financial Analysis | MCP-Universe Financial Analysis | Success Rate 0.825 | 24 |
| Location Navigation | MCP-Universe Location Navigation | Success Rate 28.86 | 24 |
| Repository Management | MCP-Universe Repository Management | Success Rate 36.36 | 24 |
| Tool Reasoning | ToolBench (G1) | Pass Rate 80.3 | 24 |
| Tool Reasoning | ToolBench G2 | Pass Rate 88.5 | 24 |