Step Completion on CoSPlan Shuffle-E
[Chart: Accuracy. GPT-4o, 0.301 (Dec 11, 2025)]
Updated 4d ago
Evaluation Results
Method        Reasoning Strategy  Date     Accuracy
GPT-4o        Sce...              2025.12  0.301
GPT-4o        Cha...              2025.12  0.276
CoG-VLM       Cha...              2025.12  0.271
Qwen2 VL-8B   Sce...              2025.12  0.251
Qwen2 VL-8B   Cha...              2025.12  0.249
Qwen2 VL-8B   Van...              2025.12  0.241
CoG-VLM       Sce...              2025.12  0.237
Janus-pro-7B  Sce...              2025.12  0.235
Intern-VLM    Sce...              2025.12  0.234
Janus-pro-7B  Van...              2025.12  0.232
Intern-VLM    Cha...              2025.12  0.232
CoG-VLM       Van...              2025.12  0.231
Janus-pro-7B  Cha...              2025.12  0.231
Intern-VLM    Van...              2025.12  0.201
Random        -                   2025.12  0.2