Maze Navigation on Maze (Standard)
Best result: GPT-5, 0.9961 Accuracy
[Chart: Accuracy over time; data point at Nov 28, 2025]
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| GPT-5 | category=Proprietary M... | 2025.11 | 0.9961 |
| OpenAI o3 | category=Proprietary M... | 2025.11 | 0.957 |
| Claude 4.5 Sonnet | category=Proprietary M... | 2025.11 | 0.9414 |
| OpenAI o4-mini | category=Proprietary M... | 2025.11 | 0.9375 |
| WMAct | | 2025.11 | 0.8814 |
| PPO - Interactive | mode=interactive | 2025.11 | 0.8374 |
| Gemini 2.5 Pro | category=Proprietary M... | 2025.11 | 0.8359 |
| PPO - EntirePlan | mode=single-turn output | 2025.11 | 0.7504 |
| Qwen3-14B | category=Opensource Mo... | 2025.11 | 0.7227 |
| Qwen3-8B | category=Opensource Mo... | 2025.11 | 0.6367 |
| GPT-4o | category=Proprietary M... | 2025.11 | 0.0469 |
| Qwen3-8B-Own | backbone=Qwen3-8B | 2025.11 | 0.0195 |
| Qwen2.5-32B-Instruct | category=Opensource Mo... | 2025.11 | 0.0068 |
| Qwen2.5-7B-Instruct | category=Opensource Mo... | 2025.11 | 0 |