SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning
About
Large Language Model (LLM) agents have shown stunning results on complex tasks, yet they often operate in isolation, failing to learn from past experiences. Existing memory-based methods primarily store raw trajectories, which are often redundant and noise-heavy. This prevents agents from extracting the high-level, reusable behavioral patterns that are essential for generalization. In this paper, we propose SkillRL, a framework that bridges the gap between raw experience and policy improvement through automatic skill discovery and recursive evolution. Our approach introduces an experience-based distillation mechanism to build a hierarchical skill library, SkillBank; an adaptive retrieval strategy for general and task-specific heuristics; and a recursive evolution mechanism that allows the skill library to co-evolve with the agent's policy during reinforcement learning. These innovations significantly reduce the token footprint while enhancing reasoning utility. Experimental results on ALFWorld, WebShop, and seven search-augmented tasks demonstrate that SkillRL achieves state-of-the-art performance, outperforming strong baselines by over 15.3% and maintaining robustness as task complexity increases. Code is available at https://github.com/aiming-lab/SkillRL.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Interactive Decision-making | AlfWorld | Overall Success Rate: 89.9 | 118 |
| Single-hop Question Answering | PopQA | -- | 104 |
| Embodied Task | AlfWorld | Overall Success Rate: 89.9 | 96 |
| Single-hop Question Answering | TriviaQA | -- | 81 |
| Multi-hop QA | HotpotQA | -- | 76 |
| Multi-hop QA | MuSiQue | EM: 20.2 | 65 |
| Interactive web-based shopping tasks | Webshop | Score: 85.2 | 60 |
| Multi-hop Question Answering | Multi-Hop QA (HotpotQA, 2Wiki, Musique, Bamboogle) | HotpotQA Score: 45.9 | 48 |
| Question Answering | Search-QA | HotpotQA Score: 43.2 | 46 |
| Web-based Agent Interaction | WebShop (val) | Success Rate: 72.7 | 31 |