MAZE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Maze Navigation | Maze (test) | Success Rate: 0.8 | 25 |
| Maze Navigation | Maze Hard | Accuracy: 97.66 | 14 |
| Maze Navigation | Maze (Standard) | Accuracy: 0.9961 | 14 |
| Sequential Planning | Maze | Score (L=8): 100 | 12 |
| Planning | 10x10 Maze | Validity Rate: 57 | 12 |
| Planning | Maze | Success Rate: 0.63 | 10 |
| Goal-reaching | maze_large (test) | Success Rate: 70.5 | 10 |
| Reinforcement Learning | Maze 17^10 structured discrete | Mean Score: 9.61 | 9 |
| Visual Planning | MAZE | EM: 74.5 | 8 |
| Multimodal Maze Solving | MAZE | Pass@1 Accuracy: 84 | 8 |
| Reinforcement Learning | Maze 5^4 unstructured discrete | Mean Performance: 9.57 | 7 |
| Reinforcement Learning | Maze 5^4 structured discrete | Mean Score: 9.74 | 7 |
| Problem Solving and Unsolvability Detection | Maze Hard | Solvable Accuracy: 98 | 7 |
| Problem Solving and Unsolvability Detection | Maze Easy | Accuracy (Solvable): 100 | 7 |
| Hierarchical Planning | Maze | Token Cost: 3,518 | 6 |
| Multi-Agent Task Scheduling | Maze \|A\| = 400 (test) | Throughput: 2,417 | 4 |
| Reinforcement Learning | Maze 17^10 unstructured discrete | Mean Score: 9.25 | 4 |
| POMDP Planning | maze-10 POMDP PRISM format (original enlarged) | Value (IQM): 8.86 | 4 |
| Spatial Reasoning | MAZE | Pass@1: 85.5 | 4 |
| Extrapolation | Maze (24, 124) (test) | Accuracy: 100 | 4 |
| Long-horizon prediction | Medium Maze | NLL: -0.88 | 4 |
| Maze Path Planning | Maze 48x48 | Validity: 89 | 3 |
| Maze Path Planning | Maze 32x32 | Validity: 88.6 | 3 |
| Maze Path Planning | Maze 16x16 | Validity: 88.6 | 3 |
| Maze Path Planning | Maze 8x8 | Validity: 94 | 3 |
Showing 25 of 33 rows