| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| VR-Bench | ChEaP | Success Rate (pass@2, Easy) | 72 | 10 | 18d ago |
| Frozen Lake | EPBS | Success Rate (pass@2, 4x4) | 98.7 | 10 | 18d ago |
| PerfectMaze XL (held-out) | ADD | Solved Rate | 14 | 9 | 1mo ago |
| PerfectMaze Large (held-out) | TRACED | Solved Rate | 27 | 9 | 1mo ago |
| Mazes-25 | LocRNN (ACT) | Accuracy | 49.99 | 7 | 1mo ago |
| Mazes-19 | LocRNN (ACT) | Accuracy (Mazes-19) | 86.83 | 7 | 1mo ago |
| Maze (test) | TRM | Accuracy | 85.3 | 4 | 24d ago |
| Maze | McVAMP | Mean Path Length | 0.032 | 3 | 3d ago |
| Maze-hard (test) | SE-RRM | FSR | 88.8 | 3 | 1mo ago |
| Maze | CMM | Accuracy | 82.2 | 1 | 24d ago |
| Mazes 25x25 (test) | - | Accuracy | - | 0 | 1mo ago |
| Mazes 19x19 (test) | - | Accuracy | - | 0 | 1mo ago |