AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

About

Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction of sub-agents, which limits adaptability. We address this challenge with a unified, framework-agnostic agent abstraction that models any agent as a tuple (Instruction, Context, Tools, Model). This tuple acts as a compositional recipe for capabilities, enabling the system to spawn specialized executors for each task on demand. Building on this abstraction, we introduce AOrchestra, an agentic system in which a central orchestrator concretizes the tuple at each step: it curates task-relevant context, selects tools and models, and delegates execution via on-the-fly automatic agent creation. This design reduces human engineering effort and remains framework-agnostic, with plug-and-play support for diverse agents as task executors. It also enables a controllable performance-cost trade-off, allowing the system to approach Pareto efficiency. Across three challenging benchmarks (GAIA, SWE-Bench, Terminal-Bench), AOrchestra achieves a 16.28% relative improvement over the strongest baseline when paired with Gemini-3-Flash. The code is available at: https://github.com/FoundationAgents/AOrchestra
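The four-element tuple and the orchestrator's per-step "concretize, then delegate" loop can be illustrated with a small sketch. This is not the AOrchestra implementation: every name below (`AgentSpec`, `create_agent`, `orchestrate`, the registries, the toy tools and models) is hypothetical, and context curation is reduced to "keep the most recent results" purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-ins: a tool and a model are both plain text-to-text callables.
Tool = Callable[[str], str]
Model = Callable[[str], str]


@dataclass(frozen=True)
class AgentSpec:
    """One concretized (Instruction, Context, Tools, Model) tuple."""
    instruction: str
    context: Tuple[str, ...]
    tools: Tuple[str, ...]   # tool names, resolved through a registry
    model: str               # model name, resolved through a registry


def create_agent(spec: AgentSpec,
                 tool_registry: Dict[str, Tool],
                 model_registry: Dict[str, Model]) -> Callable[[], str]:
    """Build a one-shot executor from a spec (assumed behaviour, for illustration)."""
    tools = {name: tool_registry[name] for name in spec.tools}
    model = model_registry[spec.model]

    def run() -> str:
        # Toy execution: call every selected tool once, then hand everything to the model.
        tool_outputs = [f"{name}: {fn(spec.instruction)}" for name, fn in tools.items()]
        prompt = "\n".join([spec.instruction, *spec.context, *tool_outputs])
        return model(prompt)

    return run


def orchestrate(plan: Sequence[Tuple[str, Sequence[str], str]],
                tool_registry: Dict[str, Tool],
                model_registry: Dict[str, Model],
                max_context: int = 2) -> List[str]:
    """For each step, fill in the tuple, spawn a sub-agent, and record its result."""
    history: List[str] = []
    for instruction, tool_names, model_name in plan:
        spec = AgentSpec(
            instruction=instruction,
            context=tuple(history[-max_context:]),  # naive context curation
            tools=tuple(tool_names),
            model=model_name,
        )
        agent = create_agent(spec, tool_registry, model_registry)
        history.append(agent())
    return history


def demo() -> List[str]:
    tools = {"word_count": lambda text: str(len(text.split()))}
    models = {
        "cheap": lambda prompt: "cheap:" + prompt.splitlines()[-1],
        "strong": lambda prompt: "strong:" + prompt.splitlines()[-1],
    }
    plan = [
        ("count words here", ["word_count"], "cheap"),   # cheaper model for a simple step
        ("summarize findings", [], "strong"),            # stronger model for a harder step
    ]
    return orchestrate(plan, tools, models)


if __name__ == "__main__":
    print(demo())
```

Choosing the model name per step is where a performance-cost trade-off of the kind the abstract mentions would enter in a sketch like this: simple steps can be routed to a cheaper model and harder ones to a stronger model.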

Jianhao Ruan, Zhihao Xu, Yiran Peng, Fashen Ren, Zhaoyang Yu, Xinbing Liang, Jinyu Xiang, Yongru Chen, Bang Liu, Chenglin Wu, Yuyu Luo, Jiayi Zhang • 2026

Related benchmarks

Task	Dataset	Result
Code Generation	MBPP	Pass@1 80.8	211
Code Generation	HumanEval	Pass@1 84.2	145
Terminal task completion	Terminal-bench 2.0	Pass@1 52.86	63
General AI Assistant Tasks	GAIA	Pass@1 Score 69.4	38
Math problem solving	Math Macro-aggregate	Pass@1 59.3	22
Reading Comprehension	Reading Macro-aggregate	Pass@1 68.2	22
Agentic Tool-use	Agentic Macro-aggregate	Pass@1 57.9	22
Code and Software Engineering	Code/SE Macro-aggregate	Pass@1 63.5	22
Knowledge retrieval	Knowledge Macro-aggregate	Pass@1 66	22
Software Engineering	SWE-bench Verified	Pass@1 82	18

Showing 10 of 22 rows
