| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Grid-world Navigation | FrozenLake reward reversal Hidden drift | Score 100 | 45 |
| Grid World Navigation | FrozenLake (Source) | Score 67.5 | 36 |
| Gridworld Navigation | FrozenLake Source | Success Rate 9,000 | 27 |
| Grid-World Navigation | FrozenLake Drift II | Success Rate 8,500 | 18 |
| Grid Navigation | FrozenLake v1.0 (Drift I) | Success Rate 7,500 | 18 |
| Gridworld Navigation | FrozenLake Drift II - Topology Shift | Success Rate 7,500 | 18 |
| Gridworld Navigation | FrozenLake Drift I - Topology Shift | Success Rate 8,500 | 18 |
| Grid-world Navigation | FrozenLake reward reversal Source | Score 90 | 18 |
| Agent Task | FrozenLake | Success Rate 100 | 17 |
| Agent Behavior Adaptation | FrozenLake (FL) (test) | Loop Ratio 0 | 17 |
| Grid-world Navigation | FrozenLake Implicit Hidden Drift | Success Rate (Source) 88.8 | 14 |
| Grid-world Navigation | FrozenLake Explicit Structural Drift II | Success Rate (Source) 83.7 | 14 |
| Multi-turn RL navigation | FrozenLake held-out (val) | Success Rate 60.7 | 10 |
| Navigation | FrozenLake Topology Explicit structural drift (II) | Success Rate 85 | 9 |
| Navigation | FrozenLake Drift I Topology Explicit structural drift | Success Rate 0.85 | 9 |
| Navigation | FrozenLake Explicit structural drift (Source) | Success Rate 85 | 9 |
| Navigation | FrozenLake topology shift baseline (Source) | Success Rate 100 | 9 |
| Grid World Navigation | FrozenLake Drift I | Success Rate 85 | 9 |
| Grid Navigation | FrozenLake Drift II v1.0 | Success Rate 75 | 9 |
| Grid Navigation | FrozenLake Source v1.0 | Success Rate 100 | 9 |
| Grid World Navigation | FrozenLake Hidden drift | Score 100 | 9 |
| Grid World Navigation | FrozenLake reward reversal Implicit Drift Hidden drift | Score 100 | 9 |
| Grid World Navigation | FrozenLake reward reversal Implicit Drift (Source) | Score 45 | 9 |
| Visual Planning | FROZENLAKE | EM (%) 91.6 | 8 |
| Reinforcement Learning | FrozenLake | Reward 0.94 | 8 |