
FrozenLake

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Grid-world Navigation | FrozenLake reward reversal Hidden drift | Score 100 | 45 |
| Grid World Navigation | FrozenLake (Source) | Score 67.5 | 36 |
| Gridworld Navigation | FrozenLake Source | Success Rate 9,000 | 27 |
| Grid-World Navigation | FrozenLake Drift II | Success Rate 8,500 | 18 |
| Grid Navigation | FrozenLake v1.0 (Drift I) | Success Rate 7,500 | 18 |
| Gridworld Navigation | FrozenLake Drift II - Topology Shift | Success Rate 7,500 | 18 |
| Gridworld Navigation | FrozenLake Drift I - Topology Shift | Success Rate 8,500 | 18 |
| Grid-world Navigation | FrozenLake reward reversal Source | Score 90 | 18 |
| Puzzle Reasoning | FrozenLake | Success Rate 82 | 17 |
| Agent Task | FrozenLake | Success Rate 100 | 17 |
| Agent Behavior Adaptation | FrozenLake (FL) (test) | Loop Ratio 0 | 17 |
| 2D discrete grid-world planning | FrozenLake | Success Rate 83 | 15 |
| Grid-world Navigation | FrozenLake Implicit Hidden Drift | Success Rate (Source) 88.8 | 14 |
| Grid-world Navigation | FrozenLake Explicit Structural Drift II | Success Rate (Source) 83.7 | 14 |
| Navigation | FrozenLake | Success Rate 27 | 12 |
| Reinforcement Learning | FrozenLake | Reward 0.94 | 12 |
| Multi-turn RL navigation | FrozenLake held-out (val) | Success Rate 60.7 | 10 |
| Navigation | FrozenLake Topology Explicit structural drift (II) | Success Rate 85 | 9 |
| Navigation | FrozenLake Drift I Topology Explicit structural drift | Success Rate 0.85 | 9 |
| Navigation | FrozenLake Explicit structural drift (Source) | Success Rate 85 | 9 |
| Navigation | FrozenLake topology shift baseline (Source) | Success Rate 100 | 9 |
| Grid World Navigation | FrozenLake Drift I | Success Rate 85 | 9 |
| Grid Navigation | FrozenLake Drift II v1.0 | Success Rate 75 | 9 |
| Grid Navigation | FrozenLake Source v1.0 | Success Rate 100 | 9 |
| Grid World Navigation | FrozenLake Hidden drift | Score 100 | 9 |
Showing 25 of 41 rows