Agentic Skill Execution on SkillsBench Claude Code CLI
[Chart: Pass Count by method. SkCC: 9 (May 5, 2026). Available metrics: Pass Count, Pass Rate, Mean Reward, Pass Rate (Baseline), Pass Rate (Optimized), Delta (pp).]
Updated 28d ago
Evaluation Results
| Method | Configuration | Date | Pass Count | Pass Rate | Mean Reward | Pass Rate (Baseline) | Pass Rate (Optimized) | Delta (pp) |
|---|---|---|---|---|---|---|---|---|
| SkCC | Condition=Compiled, Ta... | 2026.05 | 9 | 33.3 | 0.378 | - | - | - |
| Claude (Original) | Condition=Original, Ta... | 2026.05 | 8 | 21.1 | 0.245 | - | - | - |
| SkCC | Model=Claude | 2026.05 | - | - | - | 21.1 | 33.3 | 12.2 |
| Liu et al. | Model=Claude | 2026.05 | - | - | - | 40.1 | 48.2 | 8.1 |
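
The Delta (pp) values appear to be the optimized pass rate minus the baseline pass rate, expressed in percentage points. The page does not state this formula, so the following is only a minimal sketch under that assumption; the function name `delta_pp` and the `rows` dictionary are hypothetical, and the numbers are copied from the results listed above.

```python
def delta_pp(baseline: float, optimized: float) -> float:
    """Difference between two pass rates, in percentage points.

    Assumption: Delta (pp) = Pass Rate (Optimized) - Pass Rate (Baseline).
    Rounded to one decimal place to match the precision shown on the page.
    """
    return round(optimized - baseline, 1)


# (baseline, optimized) pass rates in percent, taken from the results above.
rows = {
    "SkCC": (21.1, 33.3),
    "Liu et al.": (40.1, 48.2),
}

deltas = {name: delta_pp(base, opt) for name, (base, opt) in rows.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} pp")
```

Under this assumption the computed differences match the listed Delta (pp) values of 12.2 and 8.1. A percentage-point difference is an absolute gap between two rates, not a relative improvement.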