Long-context understanding and resolution on LongBench v2 (test)
[Chart: Resolution Success Rate by date. Highlighted point: Codex CLI, 135 (May 7, 2026). Selectable metrics: Resolution Success Rate, Cost ($), P-Value, Total Tokens (M).]
Updated 23d ago
Evaluation Results
| Method | System | Date | Resolution Success Rate | Cost ($) | P-Value | Total Tokens (M) |
|---|---|---|---|---|---|---|
| Codex CLI | Codex CLI | 2026.05 | 135 | 25.58 | 0.041 | 35.19 |
| Spell | Spell | 2026.05 | 122 | 27.83 | 0.041 | 10.36 |