| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Classification | chess | Accuracy: 85.4 | 12 |
| Tool Selection | Chess Skill: beginner, intermediate, advanced | Accuracy: 100 | 10 |
| Tool Selection | Chess Specialists: opening, midgame, endgame, late-endgame | Accuracy: 64.4 | 10 |
| Extrapolation | Chess (8, 8) (test) | Accuracy: 84.3 | 4 |
| Board state reconstruction | Chess | Coverage: 98 | 3 |
| Chess | Chess Match vs Stockfish 100 games | Win Rate: 25 | 2 |
| Instruction Recovery | Chess O3 | Extracted Instructions: 126,516,521 | 1 |
| Instruction Recovery | Chess O2 | Extracted Instructions Count: 1,905,184 | 1 |
| Instruction Recovery | Chess O0 | Extracted Instructions Count: 8,568,798 | 1 |