| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Offline Reinforcement Learning | puzzle-3x3-play OGBench 5 tasks v0 | Average Success Rate: 87 | 19 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle 3x3 play v0 | Success Rate: 100 | 13 | |
| overall | puzzle 4x6 | Success Rate: 5,100 | 10 | |
| task5 | puzzle-4x6 | Success Rate: 0 | 10 | |
| task4 | puzzle 4x6 | Success Rate: 23 | 10 | |
| task3 | puzzle-4x6 | Success Rate: 6,700 | 10 | |
| task2 | puzzle 4x6 | Success Rate: 66 | 10 | |
| task1 | puzzle 4x6 | Success Rate: 10,000 | 10 | |
| overall | puzzle 4x5 | Success Rate: 9,700 | 10 | |
| task5 | puzzle 4x5 | Success Rate: 8,800 | 10 | |
| task4 | puzzle 4x5 | Success Rate: 99 | 10 | |
| task3 | puzzle 4x5 | Success Rate: 100 | 10 | |
| task2 | puzzle 4x5 | Success Rate: 9,900 | 10 | |
| task1 | puzzle 4x5 | Success Rate: 10,000 | 10 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle-4x6-1B | Success Rate: 9,100 | 10 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle 4x5 | Success Rate: 9,600 | 10 | |
| Manipulation | puzzle 4x4-play-oraclerep v0 | Task 1 Success Rate: 62 | 9 | |
| Manipulation | puzzle 3x3 play oraclerep v0 | Task 1 Success Rate: 99 | 9 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle 3x3-play-oraclerep v0 | task1: 99 | 9 | |
| Puzzle Solving | Puzzle 10,000 puzzles (test) | Puzzle Success Rate: 97.2 | 7 | |
| Logic Reasoning | Puzzle 1.0 (test) | F1 Score: 19 | 7 | |
| Charts, figures, and puzzles | Puzzle | Score: 77.35 | 6 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle 4x6-play v0 | Average Binary Success Rate: 18 | 6 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle 4x5 play v0 | Success Rate: 18 | 6 | |
| Offline Goal-Conditioned Reinforcement Learning | puzzle 4x4 play v0 | Average Binary Success Rate: 88 | 6 | |