Graph-of-Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills
About
Modern LLM agents increasingly rely on reusable skills, and as they interact with personal applications, web browsers, and other interfaces, skill libraries can scale to thousands of skills. Scaling to larger skill sets introduces two key challenges. First, loading the full skill set saturates the context window, driving up token costs, hallucination, and latency. Second, semantic retrieval surfaces topically relevant skills but misses their prerequisite chain of upstream and downstream skills, creating a prerequisite gap that leaves the retrieved bundle execution-incomplete. In this paper, we present Graph-of-Skills (GoS), an inference-time structural retrieval layer for large skill libraries. GoS constructs an executable skill graph offline from skill packages, then at inference time retrieves a bounded, dependency-aware skill bundle through hybrid semantic-lexical seeding, reverse-aware Personalized PageRank, and context-budgeted hydration. On SkillsBench and ALFWorld, GoS consistently delivers substantial reward improvements and token savings across three model families (Claude Sonnet 4.5, MiniMax M2.7, and GPT-5.2 Codex). On SkillsBench, GoS achieves a peak reward increase of 25.55% while reducing total tokens by 56.72% over the vanilla full skill-loading baseline using GPT-5.2 Codex. Ablations confirm this pattern across skill libraries from 200 to 2,000 skills.
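The retrieval stage described above can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes seed scores already come from the hybrid semantic-lexical step, that edges are `(skill, prerequisite)` pairs, and that "reverse-aware" means the walk may also follow edges backwards at a reduced weight. The names `personalized_pagerank`, `hydrate`, `reverse_weight`, and the example skills and token costs are all hypothetical.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, alpha=0.15, reverse_weight=0.5, iters=50):
    """Power-iteration Personalized PageRank over a skill dependency graph.

    edges: iterable of (skill, prerequisite) pairs (assumed edge direction).
    seeds: dict mapping skill -> non-negative seed score.
    reverse_weight: assumed weight for walking an edge backwards.
    """
    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("at least one seed must have a positive score")

    neighbors = defaultdict(dict)
    nodes = set(seeds)
    for src, dst in edges:
        nodes.update((src, dst))
        neighbors[src][dst] = neighbors[src].get(dst, 0.0) + 1.0
        neighbors[dst][src] = neighbors[dst].get(src, 0.0) + reverse_weight

    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            out = neighbors.get(n, {})
            z = sum(out.values())
            if z == 0:
                # Dangling node: send its mass back to the seed distribution.
                for m in nodes:
                    nxt[m] += (1 - alpha) * rank[n] * restart[m]
                continue
            for m, w in out.items():
                nxt[m] += (1 - alpha) * rank[n] * w / z
        rank = nxt
    return rank


def hydrate(rank, token_costs, budget):
    """Greedily load the highest-ranked skills that fit the token budget."""
    bundle, used = [], 0
    for skill in sorted(rank, key=lambda s: (-rank[s], s)):
        if rank[skill] <= 0:
            break  # remaining skills are unreachable from the seeds
        cost = token_costs[skill]
        if used + cost <= budget:
            bundle.append(skill)
            used += cost
    return bundle, used


# Hypothetical skill library: deploy_app needs build_image, which needs install_docker.
edges = [
    ("deploy_app", "build_image"),
    ("build_image", "install_docker"),
    ("send_email", "auth_smtp"),
]
token_costs = {
    "deploy_app": 400,
    "build_image": 300,
    "install_docker": 500,
    "send_email": 200,
    "auth_smtp": 200,
}
rank = personalized_pagerank(edges, seeds={"deploy_app": 1.0})
bundle, used = hydrate(rank, token_costs, budget=1200)
```

In this toy library, a query that seeds only `deploy_app` still pulls in its prerequisite chain, while the unrelated email skills receive no rank mass and are never loaded. The budget check is what keeps the bundle bounded as the library grows.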
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Agent Task | AlfWorld | Success Rate | 75.4 | 40 |
| Sequential embodied environment task | ALFWorld 140-episode | Average Reward (%) | 97.9 | 9 |
| Agent Task Completion | tau2-bench Airline | Pass Rate | 56.1 | 9 |
| Agent Task Completion | tau2-bench, SkillsBench, and ALFWorld Average | Average Pass Rate | 53.6 | 9 |
| Agent Task Success | tau2-bench Retail Domain | Total Pass Rate | 60 | 9 |
| Agent Task Completion | tau2-Bench Telecom | Pass Rate | 60.4 | 9 |
| Agent Task Completion | SkillsBench | Pass Rate | 15.9 | 9 |