Sokoban on Sokoban
[Chart: Success Rate by date. Top result: Gemini 3.1 Pro, Success Rate 55, May 11, 2026. Metrics shown: Success Rate, Average Actions.]
Updated 22d ago
Evaluation Results
| Method | Group | Date | Success Rate | Average Actions |
| --- | --- | --- | --- | --- |
| Gemini 3.1 Pro | Closed (zero-shot) | 2026.05 | 55 | 21 |
| Adaptive (Qwen2.5-VL-7B) | Ours | 2026.05 | 35.9 | 30 |
| GPT-5.5 | Closed (zero-shot) | 2026.05 | 35 | 22.1 |
| Claude Sonnet | Closed (zero-shot) | 2026.05 | 0 | - |
| InternVL3-8B/14B/78B | Open (zero-shot) | 2026.05 | 0 | - |
| Qwen2.5-VL-7B/72B | Open (zero-shot) | 2026.05 | 0 | - |
| Qwen3-VL-8B/32B | Open (zero-shot) | 2026.05 | 0 | - |