Sokoban

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Agentic Reasoning | Sokoban | Success Rate | 85 | 27 |
| Spatial planning | Sokoban | Success Rate | 79 | 19 |
| Sokoban | Sokoban (Hard-2) | Accuracy | 87.89 | 14 |
| Sokoban | Sokoban (Hard-1) | Accuracy | 91.41 | 14 |
| Sokoban | Sokoban (Standard) | Accuracy | 98.83 | 14 |
| Puzzle Solving | Sokoban (Out Of Distribution) | Avg@128 | 66.9 | 12 |
| Puzzle Solving | Sokoban (In Distribution) | Average Score @128 | 89.7 | 12 |
| Agentic Reasoning | Sokoban (test) | Success Rate | 38.3 | 12 |
| Sokoban | Sokoban (test) | Success Rate | 90.6 | 12 |
| Planning | Full Sokoban | Validity Rate | 46 | 12 |
| Planning | Sokoban Grid | Validity Rate | 63 | 12 |
| Planning | Sokoban unseen problems | Completion Rate | 100 | 11 |
| Planning | Sokoban known optimal problems | Optimal Rate | 1 | 11 |
| Video Reasoning | Sokoban (test) | Precision | 34 | 11 |
| Interactive Agent | Sokoban | Pass@1 | 64.6 | 10 |
| Sokoban Puzzle | Sokoban Symbol variant | Box Placement Score | 2 | 10 |
| Sokoban Puzzle | Sokoban Action variant | Box Placement Score | 1.83 | 10 |
| Sokoban Puzzle | Sokoban Base | Box Placement Score | 1.89 | 10 |
| Multi-turn RL navigation | Sokoban held-out (val) | Success Rate | 52.3 | 10 |
| Planning | Sokoban | Completion Rate | 100 | 9 |
| Puzzle Solving | Sokoban | Success Rate | 43.6 | 8 |
| Single-Agent Spatial Puzzles | Sokoban (In-domain) | Success Rate | 77.3 | 8 |
| Reinforcement Learning | Sokoban | Reward | 0.87 | 8 |
| Planning | Sokoban | p@1 | 42.4 | 8 |
| Sokoban | Sokoban 20 x 20, 4 boxes (test) | Success Rate | 77 | 8 |
Showing 25 of 52 rows