| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Sokoban | Sokoban (Hard-2) | Accuracy: 87.89 | 14 |
| Sokoban | Sokoban (Hard-1) | Accuracy: 91.41 | 14 |
| Sokoban | Sokoban (Standard) | Accuracy: 98.83 | 14 |
| Planning | Full Sokoban | Validity Rate: 46 | 12 |
| Planning | Sokoban Grid | Validity Rate: 63 | 12 |
| Sokoban Puzzle | Sokoban Symbol variant | Box Placement Score: 2 | 10 |
| Sokoban Puzzle | Sokoban Action variant | Box Placement Score: 1.83 | 10 |
| Sokoban Puzzle | Sokoban Base | Box Placement Score: 1.89 | 10 |
| Multi-turn RL navigation | Sokoban held-out (val) | Success Rate: 52.3 | 10 |
| Reinforcement Learning | Sokoban | Reward: 0.87 | 8 |
| Planning | Sokoban | p@1: 42.4 | 8 |
| Sokoban | Sokoban 20 x 20, 4 boxes (test) | Success Rate: 77 | 8 |
| Sokoban | Sokoban 16 x 16, 4 boxes (test) | Success Rate: 85 | 8 |
| Sokoban | Sokoban 12 x 12, 4 boxes (test) | Success Rate: 93 | 8 |
| Hierarchical Planning | Sokoban | Token Cost: 2,608 | 6 |
| Multi-turn RL Task Completion | Sokoban | Success Rate: 38.3 | 6 |
| Puzzle Solving | Sokoban Jr_1 Levels 1.0 | Solve Rate: 49 | 5 |
| Puzzle Solving | Sokoban (test) | Average Reward: 3.74 | 4 |
| Text Game | Sokoban (test) | Accuracy: 53.9 | 4 |
| Sokoban Box Pushing | Sokoban | Box 1 Success Rate: 98 | 4 |