Sokoban on Sokoban (Standard)
OpenAI o3: 98.83 Accuracy (Nov 28, 2025)
Evaluation Results
Method                Details                     Date     Accuracy
OpenAI o3             category=Proprietary M...   2025.11  98.83
GPT-5                 category=Proprietary M...   2025.11  96.88
OpenAI o4-mini        category=Proprietary M...   2025.11  83.2
WMAct                                             2025.11  78.57
PPO - Interactive     mode=interactive            2025.11  64.21
Claude 4.5 Sonnet     category=Proprietary M...   2025.11  60.94
PPO - EntirePlan      mode=single-turn output     2025.11  49.12
Gemini 2.5 Pro        category=Proprietary M...   2025.11  36.72
Qwen3-14B             category=Opensource Mo...   2025.11  18.75
Qwen3-8B              category=Opensource Mo...   2025.11  16.41
Qwen3-8B-Own          backbone=Qwen3-8B           2025.11  3.29
GPT-4o                category=Proprietary M...   2025.11  1.95
Qwen2.5-7B-Instruct   category=Opensource Mo...   2025.11  0
Qwen2.5-32B-Instruct  category=Opensource Mo...   2025.11  0