
MAZE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Maze Navigation | Maze (test) | Success Rate: 0.8 | 25 |
| Multi-Agent Path Finding | Medium Maze 25x25 world size, 32.8% static obstacle rate | Success Rate: 100 | 20 |
| Maze Navigation | Maze Hard | Accuracy: 97.66 | 18 |
| One-step next-observation prediction | Maze (test) | Token F1: 98 | 16 |
| Reasoning | Maze Hard | pass@1 Accuracy: 93.7 | 15 |
| Maze Navigation | Maze (Standard) | Accuracy: 0.9961 | 14 |
| Sequential Planning | Maze | Score (L=8): 100 | 12 |
| Planning | 10x10 Maze | Validity Rate: 57 | 12 |
| Video Reasoning | Maze (test) | Precision: 82.1 | 11 |
| Multi-Objective Reinforcement Learning | Maze | Mean Episode Reward (MER): 223.55 | 11 |
| Spatial Reasoning | Maze 10×10 | CR (%): 61.37 | 10 |
| Logical Reasoning | Maze | Pass@1: 98 | 10 |
| Video Generation | Maze | Maze Flow (Base): 96.5 | 10 |
| Visual Reasoning | Maze | Accuracy (Scale 8): 100 | 10 |
| Planning | Maze | Success Rate: 0.63 | 10 |
| Goal-reaching | maze_large (test) | Success Rate: 70.5 | 10 |
| Pathfinding | Maze hard (test) | Accuracy: 85.3 | 9 |
| Maze solving | Maze (test) | Accuracy: 99.9 | 9 |
| Reinforcement Learning | Maze 17^10 structured discrete | Mean Score: 9.61 | 9 |
| Maze navigation | Maze 100 held-out mazes | Best Success Rate @ 3: 52.6 | 8 |
| Maze Solving | Maze Hard | RSR: 71.8 | 8 |
| Reinforcement Learning | Maze | Mean Reward: 1.03 | 8 |
| Convex Free-Space Approximation | Maze Line Seed | Time (ms): 1.48 | 8 |
| Convex Free-Space Approximation | Maze Point Seed | Time (ms): 1.32 | 8 |
| Visual Planning | MAZE | EM: 74.5 | 8 |
Showing 25 of 65 rows