Agentic Reasoning on Terminal Bench Core 2.0
[Chart: Success Rate by date. Top point: Qwen3.5-122B-A10B, 37.5, Apr 14, 2026. Updated 4d ago.]
Evaluation Results
Method               Date      Success Rate
Qwen3.5-122B-A10B    2026.04   37.5
Nemotron 3 Super     2026.04   31
GPT-OSS-120B         2026.04   18.7