Agentic Task Completion on Terminal-Bench All 2

77 Pass@1

AHE

Updated 2mo ago

Evaluation Results

Method	Pass@1
AHE 2026.04	77
TF-GRPO 2026.04	72.3
Codex 2026.04	71.9
NexAU0 2026.04	69.7
ACE 2026.04	68.9
terminus-2 2026.04	62.9
opencode 2026.04	47.2