Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

About

Recent advances in large language models (LLMs) have empowered autonomous agents to perform multi-turn interactions with tools and environments. However, scaling such agent training is limited by the lack of diverse and reliable environments. In this paper, we propose Agent World Model (AWM), a fully synthetic environment generation pipeline. Using this pipeline, we scale to 1,000 environments covering everyday scenarios, in which agents can interact with rich toolsets and obtain high-quality observations. Notably, these environments are code-driven and backed by databases, providing more reliable and consistent state transitions than environments simulated by LLMs. Moreover, they enable more efficient agent interaction compared with collecting trajectories from realistic environments. To demonstrate the effectiveness of this resource, we perform large-scale reinforcement learning for multi-turn tool-use agents. Thanks to the fully executable environments and accessible database states, we can also design reliable reward functions. Experiments on three benchmarks show that training exclusively in synthetic environments, rather than benchmark-specific ones, yields strong out-of-distribution generalization. The code is available at https://github.com/Snowflake-Labs/agent-world-model.
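To make the idea of a code-driven, database-backed environment concrete, here is a minimal sketch. It is not taken from the AWM codebase: the `TodoEnv` class, its two tools, the `tasks` table, and the `reward` function are all hypothetical names chosen for illustration. It only shows the general pattern the abstract describes, where tools are ordinary code that mutates a database and the reward is computed by inspecting the resulting database state.

```python
import sqlite3


class TodoEnv:
    """Hypothetical toy environment: tools are plain functions over a SQLite database."""

    def __init__(self):
        # In-memory database, so every environment instance starts from the same state.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT, done INTEGER DEFAULT 0)"
        )
        self.db.executemany(
            "INSERT INTO tasks (title) VALUES (?)", [("buy milk",), ("file taxes",)]
        )
        self.db.commit()

    def list_tasks(self):
        # Read-only tool: the observation is derived directly from the database.
        rows = self.db.execute("SELECT id, title, done FROM tasks ORDER BY id").fetchall()
        return [{"id": r[0], "title": r[1], "done": bool(r[2])} for r in rows]

    def complete_task(self, task_id):
        # State-changing tool: the transition is a deterministic SQL update.
        cur = self.db.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
        self.db.commit()
        if cur.rowcount == 0:
            return {"ok": False, "error": f"no task with id {task_id}"}
        return {"ok": True}

    def step(self, tool, **args):
        # Dispatch one tool call issued by the agent and return its observation.
        tools = {"list_tasks": self.list_tasks, "complete_task": self.complete_task}
        if tool not in tools:
            return {"ok": False, "error": f"unknown tool {tool}"}
        return tools[tool](**args)


def reward(env, title):
    # Assumed reward design: 1.0 if the named task is marked done in the database, else 0.0.
    row = env.db.execute("SELECT done FROM tasks WHERE title = ?", (title,)).fetchone()
    return 1.0 if row is not None and row[0] == 1 else 0.0
```

An agent rollout would call `env.step("list_tasks")`, pick an id, call `env.step("complete_task", task_id=...)`, and the episode would then be scored with `reward(env, "file taxes")`. Because the check reads the database rather than the agent's text output, the same final state always yields the same reward.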

Zhaoyang Wang, Canwen Xu, Boyi Liu, Yite Wang, Siwei Han, Zhewei Yao, Huaxiu Yao, Yuxiong He • 2026

Related benchmarks

Task	Dataset	Result
Function Calling	BFCL V3	Overall Accuracy 70.18	104
Agentic Tool-use	tau2-Bench	Retail Score 63.6	59
Interactive Tool-Use Agent Performance	VitaBench	Delivery Score 22	44
Agentic Workflow Success	τ2-bench	Airline Success Rate 38.5	43
Agentic Task Success	MCP-Universe	Financial Success Score 35	41
Function Calling	BFCL v4	Multi-Turn Success Rate 37.6	32
Tool Use	MCPMark	Total Success Rate 5.1	31
Multi-Turn Tool Calling	τ2-bench	Airline Score 30	19
Agentic Task Completion	τ2-bench	Airline Success Rate 32	19
Tool Use	MCP-Atlas	Pass Rate 6.19	19

Showing 10 of 21 rows
