Towards General Agentic Intelligence via Environment Scaling

About

Advanced agentic intelligence is a prerequisite for deploying Large Language Models in practical, real-world applications. Diverse real-world APIs demand precise, robust function-calling intelligence, which agents must develop through interaction in varied environments. The breadth of function-calling competence is closely tied to the diversity of environments in which agents are trained. In this work, we scale up environments as a step towards advancing general agentic intelligence. This gives rise to two central challenges: (i) how to scale environments in a principled manner, and (ii) how to effectively train agentic capabilities from experiences derived through interactions with these environments. To address these, we design a scalable framework that automatically constructs heterogeneous, fully simulated environments, systematically broadening the space of function-calling scenarios. We further adapt a two-phase agent fine-tuning strategy: first endowing agents with fundamental agentic capabilities, then specializing them for domain-specific contexts. Extensive experiments on the agentic benchmarks tau-bench, tau2-Bench, and ACEBench demonstrate that our trained model, AgentScaler, significantly enhances the function-calling capability of models.
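To make the idea of a fully simulated function-calling environment concrete, here is a minimal sketch in Python. It is not the authors' framework: the `SimulatedEnv` class, the toy retail state, and the `get_order` / `cancel_order` tools are all hypothetical names chosen for illustration. The only assumption carried over from the abstract is that an agent interacts with an environment by issuing function calls.

```python
from typing import Any, Callable, Dict, List


class SimulatedEnv:
    """Toy simulated environment: tools are plain functions over an in-memory state.

    Hypothetical illustration only; not the AgentScaler implementation.
    """

    def __init__(self, state: Dict[str, Any]) -> None:
        self.state = state
        self.tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        """Expose a function to the agent under a tool name."""
        self.tools[name] = fn

    def call(self, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """Execute one function call and return a structured observation."""
        if name not in self.tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": self.tools[name](self.state, **arguments)}
        except (TypeError, KeyError, ValueError) as exc:
            # Bad arguments or invalid operations become error observations
            # that the agent can react to, instead of crashing the episode.
            return {"ok": False, "error": str(exc)}


def get_order(state: Dict[str, Any], order_id: str) -> Dict[str, Any]:
    """Read-only tool: look up an order."""
    return state["orders"][order_id]


def cancel_order(state: Dict[str, Any], order_id: str) -> Dict[str, Any]:
    """State-changing tool: cancel an order if it is still pending."""
    order = state["orders"][order_id]
    if order["status"] != "pending":
        raise ValueError("only pending orders can be cancelled")
    order["status"] = "cancelled"
    return order


def make_retail_env() -> SimulatedEnv:
    """Build a small, hypothetical retail-style environment."""
    env = SimulatedEnv(
        {"orders": {"A1": {"status": "pending"}, "B2": {"status": "shipped"}}}
    )
    env.register("get_order", get_order)
    env.register("cancel_order", cancel_order)
    return env


def run_trajectory(env: SimulatedEnv, calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Replay a list of {"name", "arguments"} calls and collect the observations."""
    return [env.call(c["name"], c["arguments"]) for c in calls]
```

In a sketch like this, broadening the space of function-calling scenarios would amount to generating many different initial states and tool sets; the recorded call/observation pairs are the kind of interaction experience that could later be used for fine-tuning.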

Runnan Fang, Shihao Cai, Baixuan Li, Jialong Wu, Guangyu Li, Wenbiao Yin, Xinyu Wang, Xiaobin Wang, Liangcai Su, Zhen Zhang, Shibin Wu, Zhengwei Tao, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou • 2025

Related benchmarks

Task	Dataset	Result
Interactive Tool-Use Agent Performance	tau2-Bench	Retail Performance Score: 70.2	102
Agent Performance	Tau-Bench	Retail Accuracy: 70.4	55
Interactive Tool-Use Agent Performance	VitaBench	Delivery Score: 25	44
Agentic Workflow Success	τ2-bench	Airline Success Rate: 60	43
Agent Performance	ACEBench Agent	Agent Score: 60	36
