
Sokoban

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Sokoban | Sokoban (Hard-2) | Accuracy 87.89 | 14 |
| Sokoban | Sokoban (Hard-1) | Accuracy 91.41 | 14 |
| Sokoban | Sokoban (Standard) | Accuracy 98.83 | 14 |
| Puzzle Solving | Sokoban (Out Of Distribution) | Avg@128 66.9 | 12 |
| Puzzle Solving | Sokoban (In Distribution) | Average Score @128 89.7 | 12 |
| Agentic Reasoning | Sokoban (test) | Success Rate 38.3 | 12 |
| Sokoban | Sokoban (test) | Success Rate 90.6 | 12 |
| Planning | Full Sokoban | Validity Rate 46 | 12 |
| Planning | Sokoban Grid | Validity Rate 63 | 12 |
| Visual Agentic Reasoning | Sokoban | Score 3 | 10 |
| Interactive Agent | Sokoban | Pass@1 64.6 | 10 |
| Sokoban Puzzle | Sokoban Symbol variant | Box Placement Score 2 | 10 |
| Sokoban Puzzle | Sokoban Action variant | Box Placement Score 1.83 | 10 |
| Sokoban Puzzle | Sokoban Base | Box Placement Score 1.89 | 10 |
| Multi-turn RL navigation | Sokoban held-out (val) | Success Rate 52.3 | 10 |
| Single-Agent Spatial Puzzles | Sokoban (In-domain) | Success Rate 77.3 | 8 |
| Reinforcement Learning | Sokoban | Reward 0.87 | 8 |
| Planning | Sokoban | p@1 42.4 | 8 |
| Sokoban | Sokoban 20 x 20, 4 boxes (test) | Success Rate 77 | 8 |
| Sokoban | Sokoban 16 x 16, 4 boxes (test) | Success Rate 85 | 8 |
| Sokoban | Sokoban 12 x 12, 4 boxes (test) | Success Rate 93 | 8 |
| Hierarchical Planning | Sokoban | Token Cost 2,608 | 6 |
| Multi-turn RL Task Completion | Sokoban | Success Rate 38.3 | 6 |
| Planning and puzzle solving | Sokoban | Accuracy 62.5 | 5 |
| Puzzle Solving | Sokoban Jr_1 Levels 1.0 | Solve Rate 49 | 5 |
Showing 25 of 28 rows