
Agent Capability Evaluation on SEAL 0

Score: 53.4

Claude-4.5-Sonnet

[Chart: score axis with ticks at 35.2, 39.925, 44.65, 49.375; date shown: Feb 6, 2026]
Updated 4d ago

Evaluation Results

Date       Score
2026.02    53.4
2026.02    51.4
2026.02    48.6
2026.02    40.5
2026.02    40.4
2026.02    39.6
2026.02    38.5
2026.02    36
2026.02    35.9