Skill evolution on Calendar merge
[Chart: Best@20 for MaskClaw, 77.7 (May 27, 2026). Metrics shown: Best@20, Base Accuracy, Evolved Accuracy, Base Unsafe Rate, Evolved Unsafe Rate, Compliance Rate.]
Evaluation Results
Method                       Best@20  Base Accuracy  Evolved Accuracy  Base Unsafe Rate  Evolved Unsafe Rate  Compliance Rate
MaskClaw (Tests=8, 2026.05)  77.7     12.5           100               87.5              0                    95.65