Cybersecurity Challenge Solving on Cybench 33-challenge
[Chart: Solved Count by method. Highlighted value: CSI::Claude, 15 solved, May 27, 2026.]
Evaluation Results
| Method | Solved Count | Success Rate (%) | Wall Time (h) | Total Cost ($) | Cost per Solve ($) | Commands Executed | Error Count | Input Tokens (B) | Output Tokens (M) |
|---|---|---|---|---|---|---|---|---|---|
| CSI::Claude (Scaffold=Claude, 2026.05) | 15 | 45.5 | 26.8 | 5,122 | 341 | 8,370 | 2,554 | 0.001 | 14.6 |
| CSI::Codex (Scaffold=Codex, 2026.05) | 15 | 45.5 | 18.4 | 1,713 | 114 | 3,437 | 287 | 0.339 | 3.9 |
| CSI::GCAI (Scaffold=GCAI, 2026.05) | 10 | 30.3 | 30.4 | 1,279 | 128 | 9,734 | 809 | 0.294 | 10.3 |
| CSI::CAI (Scaffold=CAI, 2026.05) | 7 | 21.2 | 15.9 | 727 | 104 | 386 | 454 | 0.159 | 1.1 |
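The two derived columns appear to follow from the raw ones. The sketch below is a consistency check, not the leaderboard's documented method. It assumes Success Rate is the solved count divided by the 33 challenges named in the title, expressed as a percentage, and Cost per Solve is total cost divided by solved count, rounded to the nearest dollar. The function name `derived` and the `rows` layout are illustrative only.

```python
# Assumption: 33 challenges in total, taken from the benchmark title.
TOTAL_CHALLENGES = 33

# Values copied from the results table:
# method -> (solved, reported success rate %, total cost $, reported cost per solve $)
rows = {
    "CSI::Claude": (15, 45.5, 5122, 341),
    "CSI::Codex": (15, 45.5, 1713, 114),
    "CSI::GCAI": (10, 30.3, 1279, 128),
    "CSI::CAI": (7, 21.2, 727, 104),
}


def derived(solved: int, total_cost: float) -> tuple[float, int]:
    """Recompute the derived columns under the assumptions stated above."""
    success_rate = round(100 * solved / TOTAL_CHALLENGES, 1)  # percent, 1 decimal
    cost_per_solve = round(total_cost / solved)               # whole dollars
    return success_rate, cost_per_solve


for name, (solved, rate, cost, per_solve) in rows.items():
    calc_rate, calc_per_solve = derived(solved, cost)
    matches = (calc_rate, calc_per_solve) == (rate, per_solve)
    print(f"{name}: {calc_rate}% success, ${calc_per_solve} per solve, matches table: {matches}")
```

Under these assumptions all four rows reproduce the reported figures. This also shows that CSI::Claude and CSI::Codex tie on solved count while differing by roughly a factor of three in cost per solve.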