
SkillOrchestra: Learning to Route Agents via Skill Transfer

About

Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms SoTA RL-based orchestrators by up to 22.5% with 700x and 300x learning cost reduction compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
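The selection step described in the abstract (infer skill demands, then pick the agent that best satisfies them under a performance-cost trade-off) can be illustrated with a small sketch. This is not the authors' implementation: the `AgentProfile` record, the `route` function, the linear utility, the `cost_weight` parameter, and all numbers below are assumptions made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical per-agent record (not from the paper)."""
    name: str
    competence: dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # relative cost of one invocation


def route(demands: dict[str, float], agents: list[AgentProfile],
          cost_weight: float) -> AgentProfile:
    """Return the agent with the highest demand-weighted competence
    minus cost_weight * cost (an assumed linear trade-off)."""
    if not agents:
        raise ValueError("need at least one agent")
    total = sum(demands.values())
    if total <= 0:
        raise ValueError("skill demands must have positive total weight")

    def utility(agent: AgentProfile) -> float:
        # Skills the agent has no estimate for count as competence 0.0.
        expected = sum(
            weight * agent.competence.get(skill, 0.0)
            for skill, weight in demands.items()
        ) / total
        return expected - cost_weight * agent.cost

    return max(agents, key=utility)


# Illustrative profiles; the skill names and values are invented.
agents = [
    AgentProfile("large-model",
                 {"multi_hop_reasoning": 0.9, "fact_lookup": 0.9}, cost=1.0),
    AgentProfile("small-model",
                 {"multi_hop_reasoning": 0.5, "fact_lookup": 0.85}, cost=0.1),
]

easy_turn = route({"fact_lookup": 1.0}, agents, cost_weight=0.2)
hard_turn = route({"multi_hop_reasoning": 1.0}, agents, cost_weight=0.2)
print(easy_turn.name, hard_turn.name)
```

Because the decision is recomputed from the skill demands of each turn, a cheap agent can handle a simple lookup while a costlier one is reserved for turns that need it, which is one way to avoid always invoking the strongest option. Setting `cost_weight` to 0 in this sketch reduces it to picking the most competent agent regardless of cost.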

Jiayu Wang, Yifei Ming, Zixuan Ke, Shafiq Joty, Aws Albarghouthi, Frederic Sala • 2026

Related benchmarks

Task                             Dataset     Result                  Rank
Multi-hop Question Answering     MuSiQue     --                      106
Multi-hop Question Answering     Bamboogle   Accuracy 63.2           52
Multi-hop Question Answering     2Wiki       --                      41
General Question Answering       NQ          Exact Match (EM) 54.8   36
General Question Answering       TriviaQA    Accuracy 80.2           18
General Question Answering       PopQA       Accuracy 48.8           18
Multi-hop Question Answering     HotpotQA    Accuracy 44.2           18
