Maze Navigation on Maze Hard
Best accuracy: 97.66 (GPT-5)
[Chart: Accuracy over time; data point dated Nov 28, 2025]
Updated 4d ago
Evaluation Results

| Method | Details | Date | Accuracy |
|---|---|---|---|
| GPT-5 | category=Proprietary M... | 2025.11 | 97.66 |
| OpenAI o3 | category=Proprietary M... | 2025.11 | 93.36 |
| OpenAI o4-mini | category=Proprietary M... | 2025.11 | 78.52 |
| Claude 4.5 Sonnet | category=Proprietary M... | 2025.11 | 68.36 |
| Gemini 2.5 Pro | category=Proprietary M... | 2025.11 | 63.28 |
| WMAct | - | 2025.11 | 50.59 |
| PPO - Interactive | mode=interactive | 2025.11 | 36.52 |
| Qwen3-14B | category=Opensource Mo... | 2025.11 | 28.52 |
| PPO - EntirePlan | mode=single-turn output | 2025.11 | 26.51 |
| Qwen3-8B | category=Opensource Mo... | 2025.11 | 17.76 |
| GPT-4o | category=Proprietary M... | 2025.11 | 1.56 |
| Qwen2.5-32B-Instruct | category=Opensource Mo... | 2025.11 | 0.39 |
| Qwen3-8B-Own | backbone=Qwen3-8B | 2025.11 | 0.2 |
| Qwen2.5-7B-Instruct | category=Opensource Mo... | 2025.11 | 0 |