SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

About

Agent skills are procedural artifacts that enable LLM agents to execute workflows, verify constraints, and recover from failures. Existing self-evolving methods refine skills using accumulated trajectories, but they struggle in cold-start settings, where only an initial, imperfect skill is available. Consequently, skill construction defaults to expert authoring or one-shot LLM generation. Expert-authored skills are costly and may not align with how LLM agents actually execute tasks, while one-shot generated skills can be syntactically well formed yet behaviorally weak. To bridge this gap, we propose SkillRevise, an execution-grounded framework that iteratively refines these initial skills. SkillRevise diagnoses skill defects from execution evidence, retrieves relevant repair principles from a general memory, and applies execution-anchored edits. It then re-executes each candidate, measures its empirical utility, and retains the best-performing skill version. Evaluated across three benchmarks and five LLMs, SkillRevise substantially outperforms one-shot baselines, improving the base agent's success rate on SkillsBench from 36.05% to 61.63%. Furthermore, the revised skills transfer well across models, suggesting that they capture general procedural knowledge rather than model-specific artifacts.
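
The loop described in the abstract (diagnose, retrieve, edit, re-execute, retain the best version) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Trace`, `revise_skill`, `execute`, `diagnose`, `retrieve`, and `edit` are all hypothetical, and the toy callbacks stand in for what would be LLM calls and real task executions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Trace:
    """Execution evidence for one task (hypothetical structure)."""
    task: str
    success: bool
    log: str


def success_rate(traces: Sequence[Trace]) -> float:
    """Fraction of traces that succeeded; 0.0 for an empty list."""
    return sum(t.success for t in traces) / len(traces) if traces else 0.0


def revise_skill(
    initial_skill: str,
    tasks: Sequence[str],
    execute: Callable[[str, str], Trace],
    diagnose: Callable[[str, List[Trace]], List[str]],
    retrieve: Callable[[List[str]], List[str]],
    edit: Callable[[str, List[str], List[str]], str],
    rounds: int = 3,
) -> Tuple[str, float]:
    """Iteratively revise a skill, keeping a candidate only if it scores higher."""
    best_skill = initial_skill
    best_traces = [execute(best_skill, t) for t in tasks]
    best_score = success_rate(best_traces)
    for _ in range(rounds):
        failures = [t for t in best_traces if not t.success]
        if not failures:
            break  # nothing left to diagnose
        defects = diagnose(best_skill, failures)          # defects from execution evidence
        principles = retrieve(defects)                    # repair principles from memory
        candidate = edit(best_skill, defects, principles) # apply an edit
        cand_traces = [execute(candidate, t) for t in tasks]  # re-execute the candidate
        cand_score = success_rate(cand_traces)
        if cand_score > best_score:                       # retain only on improvement
            best_skill, best_traces, best_score = candidate, cand_traces, cand_score
    return best_skill, best_score


# Toy stand-ins (assumptions for illustration only): a task "succeeds"
# when its name appears somewhere in the skill text.
MEMORY = {"verify": "always verify outputs", "retry": "retry on transient errors"}


def toy_execute(skill: str, task: str) -> Trace:
    ok = task in skill
    return Trace(task, ok, "ok" if ok else f"missing step: {task}")


def toy_diagnose(skill: str, failures: List[Trace]) -> List[str]:
    return [t.task for t in failures]


def toy_retrieve(defects: List[str]) -> List[str]:
    return [MEMORY.get(d, "") for d in defects]


def toy_edit(skill: str, defects: List[str], principles: List[str]) -> str:
    # Repair one defect per round by appending a line to the skill text.
    return f"{skill}\n{defects[0]}: {principles[0]}"


final_skill, final_score = revise_skill(
    "plan: outline the steps",
    ["plan", "verify", "retry"],
    toy_execute, toy_diagnose, toy_retrieve, toy_edit,
)
print(final_score)
```

In this toy setup the skill gains one repaired step per round, and a candidate replaces the current skill only when its measured success rate is strictly higher, which mirrors the "retain the best version" step in spirit only.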

Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang, Qing Zong, Jiahe Guo, Zhongwei Xie, Yiyan Ji, Yauwai Yim, Hongyu Luo, Xiyu Ren, Ruan Chenyu, Haoran Li, Yangqiu Song • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Skill execution | SkillsBench | Overall Success Rate (avg@5) | 53 | 26
Software Engineering Task Success | SWE-Skills-Bench Hard | Task Success Rate | 35 | 20
Task success | SkillLearnBench Random | Success Count | 29 | 20
Interactive Task Completion | ALFWorld cleaned 100-task v3 (mix of val-seen and val-unseen) | Success Rate | 71 | 12
