Multi-turn RL Task Completion on Sokoban
Top result: Qwen2.5-0.5B (TSR Beam Search), Success Rate 38.3 (Feb 12, 2026)
Evaluation Results
| Method | Settings | Date | Success Rate |
| --- | --- | --- | --- |
| Qwen2.5-0.5B (TSR Beam Search) | Model=Qwen2.5-0.5B, Tr... | 2026.02 | 38.3 |
| Qwen2.5-0.5B (TSR Lookahead) | Model=Qwen2.5-0.5B, Tr... | 2026.02 | 36.1 |
| Qwen2.5-0.5B (TSR Best-of-N) | Model=Qwen2.5-0.5B, Tr... | 2026.02 | 33.3 |
| Qwen2.5-0.5B (Instance Filtering) | Model=Qwen2.5-0.5B, Tr... | 2026.02 | 29 |
| GPT-4o | Protocol=zero-shot | 2026.02 | 27.73 |
| Qwen2.5-72B | Protocol=zero-shot | 2026.02 | 19.53 |