TerminalBench

Benchmarks

Task Name                                Dataset Name         SOTA Result        Trend
Terminal Agentic Trajectory Generation   TerminalBench 2.0    Score: 57.8        29
Terminal Agentic Trajectory Generation   TerminalBench 1.0    Score: 56.25       23
Agentic Coding                           TerminalBench        Accuracy: 0.3375   4
Terminal Agentic Trajectory Generation   TerminalBench        Pass@8: 45         4