| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Terminal Agentic Trajectory Generation | TerminalBench 2.0 | Score | 57.8 | 29 |
| Terminal Agentic Trajectory Generation | TerminalBench 1.0 | Score | 56.25 | 23 |
| Agentic Coding | TerminalBench 2 | Pass Rate | 81.8 | 17 |
| Code Generation | TerminalBench 2 | Pass@3 | 39.3 | 9 |
| Agentic Coding | TerminalBench | Accuracy | 0.3375 | 7 |
| Ranking Preservation | TerminalBench (test) | Mean Spearman Rho | 0.988 | 5 |
| Terminal Agentic Trajectory Generation | TerminalBench | Pass@8 | 45 | 4 |