
MAZE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Maze Navigation | Maze (test) | Success Rate: 0.8 | 25 |
| Maze Navigation | Maze Hard | Accuracy: 97.66 | 14 |
| Maze Navigation | Maze (Standard) | Accuracy: 0.9961 | 14 |
| Sequential Planning | Maze | Score (L=8): 100 | 12 |
| Planning | 10x10 Maze | Validity Rate: 57 | 12 |
| Multi-Objective Reinforcement Learning | Maze | Mean Episode Reward (MER): 223.55 | 11 |
| Video Generation | Maze | Maze Flow (Base): 96.5 | 10 |
| Visual Reasoning | Maze | Accuracy (Scale 8): 100 | 10 |
| Planning | Maze | Success Rate: 0.63 | 10 |
| Goal-reaching | maze_large (test) | Success Rate: 70.5 | 10 |
| Reinforcement Learning | Maze 17^10 structured discrete | Mean Score: 9.61 | 9 |
| Visual Planning | MAZE | EM: 74.5 | 8 |
| Multimodal Maze Solving | MAZE | Pass@1 Accuracy: 84 | 8 |
| Multi-agent coordination | Maze Structure Map | Final Cumulative Win Rate (FW): 81.15 | 7 |
| Reinforcement Learning | Maze 5^4 unstructured discrete | Mean Performance: 9.57 | 7 |
| Reinforcement Learning | Maze 5^4 structured discrete | Mean Score: 9.74 | 7 |
| Problem Solving and Unsolvability Detection | Maze Hard | Solvable Accuracy: 98 | 7 |
| Problem Solving and Unsolvability Detection | Maze Easy | Accuracy (Solvable): 100 | 7 |
| Reasoning | Maze Hard | pass@1 Accuracy: 93.7 | 6 |
| Multi-Agent Path Finding | maze-32-32-4 (# agents: 30) | UA Conflicts: 4.75 | 6 |
| Multi-Agent Path Finding | maze 32-32-2 (# agents: 30) | UA Conflicts: 7.59 | 6 |
| Hierarchical Planning | Maze | Token Cost: 3,518 | 6 |
| Maze solving | Maze (test) | Accuracy: 85.3 | 4 |
| Safe Navigation | Maze 2 | Success Rate (SR): 100 | 4 |
| Safe Navigation | Maze 1 | Success Rate (SR): 100 | 4 |
Showing 25 of 50 rows