Agentic Reasoning on Terminal Bench hard

26.8 Success Rate

Qwen3.5-122B-A10B

Updated 3mo ago

Evaluation Results

Method	Links	Success Rate
Qwen3.5-122B-A10B 2026.04		26.8
Nemotron 3 Super 2026.04		25.78
GPT-OSS-120B 2026.04		24