AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

About

Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction of sub-agents, which hurts adaptability. We address this challenge with a unified, framework-agnostic agent abstraction that models any agent as a tuple (Instruction, Context, Tools, Model). This tuple acts as a compositional recipe for capabilities, enabling the system to spawn specialized executors for each task on demand. Building on this abstraction, we introduce AOrchestra, an agentic system in which a central orchestrator concretizes the tuple at each step: it curates task-relevant context, selects tools and models, and delegates execution via on-the-fly automatic agent creation. This design reduces human engineering effort and remains framework-agnostic, with plug-and-play support for diverse agents as task executors. It also enables a controllable performance-cost trade-off, allowing the system to approach Pareto efficiency. Across three challenging benchmarks (GAIA, SWE-Bench, Terminal-Bench), AOrchestra achieves a 16.28% relative improvement over the strongest baseline when paired with Gemini-3-Flash. The code is available at: https://github.com/FoundationAgents/AOrchestra
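
To make the four-part abstraction concrete, here is a minimal Python sketch of how an orchestrator might fill in the tuple for one subtask and hand it to a freshly created executor. Everything in it is an assumption made for illustration: the names `AgentSpec`, `create_sub_agent`, and `orchestrate_step`, the keyword-overlap context filter, the name-matching tool selection, and the model labels `small-model` and `large-model` are all hypothetical and are not taken from the AOrchestra codebase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical tool type for illustration: a function from a query to a result string.
Tool = Callable[[str], str]


@dataclass(frozen=True)
class AgentSpec:
    """The four-part recipe named in the abstract: Instruction, Context, Tools, Model."""
    instruction: str
    context: List[str]
    tools: Dict[str, Tool]
    model: str


def create_sub_agent(spec: AgentSpec) -> Callable[[], str]:
    """Build an executor from a spec. The body is a stub standing in for a real model call."""
    def run() -> str:
        tool_names = ", ".join(sorted(spec.tools)) or "none"
        return (f"[{spec.model}] {spec.instruction} "
                f"(context items: {len(spec.context)}, tools: {tool_names})")
    return run


def orchestrate_step(subtask: str, memory: List[str],
                     toolbox: Dict[str, Tool], budget: str) -> AgentSpec:
    """Concretize the tuple for one subtask using deliberately simple heuristics."""
    words = set(subtask.lower().split())
    # Curate context: keep only notes that share at least one word with the subtask.
    context = [note for note in memory if words & set(note.lower().split())]
    # Select tools: keep tools whose name is mentioned in the subtask text.
    tools = {name: fn for name, fn in toolbox.items() if name in subtask.lower()}
    # Select a model: a placeholder cost/performance switch.
    model = "large-model" if budget == "high" else "small-model"
    return AgentSpec(subtask, context, tools, model)


memory = ["The repo uses pytest", "User prefers dark mode"]
toolbox = {"grep": lambda q: f"grep:{q}", "browser": lambda q: f"browse:{q}"}

spec = orchestrate_step("Use grep to find failing pytest cases", memory, toolbox, "low")
agent = create_sub_agent(spec)
print(agent())
```

In this sketch the `budget` argument is the only lever on the performance-cost trade-off; a real orchestrator would presumably make that choice, and the context and tool selection, with a language model rather than string matching.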

Jianhao Ruan, Zhihao Xu, Yiran Peng, Fashen Ren, Zhaoyang Yu, Xinbing Liang, Jinyu Xiang, Yongru Chen, Bang Liu, Chenglin Wu, Yuyu Luo, Jiayi Zhang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Code Generation | MBPP | Pass@1 | 80.8 | 211 |
| Code Generation | HumanEval | Pass@1 | 84.2 | 145 |
| Terminal task completion | Terminal-bench 2.0 | Pass@1 | 52.86 | 63 |
| General AI Assistant Tasks | GAIA | Pass@1 Score | 69.4 | 38 |
| Math problem solving | Math Macro-aggregate | Pass@1 | 59.3 | 22 |
| Reading Comprehension | Reading Macro-aggregate | Pass@1 | 68.2 | 22 |
| Agentic Tool-use | Agentic Macro-aggregate | Pass@1 | 57.9 | 22 |
| Code and Software Engineering | Code/SE Macro-aggregate | Pass@1 | 63.5 | 22 |
| Knowledge retrieval | Knowledge Macro-aggregate | Pass@1 | 66 | 22 |
| Software Engineering | SWE-bench Verified | Pass@1 | 82 | 18 |
Showing 10 of 22 rows
