Security Analysis on CyberGym
[Chart: Resolved Percentage over time. Labeled point: Feb 18, 2026. Top result: SageAgent, 60.2]
Updated 4d ago
Evaluation Results
Method           Configuration              Date      Resolved Percentage
SageAgent        Model=GPT-5 (medium),...   2026.02   60.2
Anthropic Agent  Model=Claude Opus 4.5,...  2026.02   50.6
OpenHands        Model=GPT-5 (high), AD...  2026.02   39.4
Anthropic Agent  Model=Claude Sonnet 4....  2026.02   28.9