Maze Navigation on MAZENAVIGATION 7x7 Long OOD Both 1.0
[Chart: EM and PR over time. Labeled point: Wan2.2-TI2V-5B, 4,000 EM, Jan 28, 2026]
Updated 4d ago
Evaluation Results
Method                                 Input/Output                Date      EM      PR
Wan2.2-TI2V-5B                         Input=Text + Image, Ou...   2026.01   4,000   5,110
Wan2.2-TI2V-5B (Unseen Visual Icons)   Input=Text + Image, Ou...   2026.01   3,800   4,790
GPT-5.1                                Input=Text + Image, Ou...   2026.01   0       0
GPT-5.2                                Input=Text + Image, Ou...   2026.01   0       0
Qwen3-VL-8B                            Input=Text + Image, Ou...   2026.01   0       1,130
Qwen3-VL-8B (w/ coordinates)           Input=Text + Image, Ou...   2026.01   0       810
VPRL-7B *                              Input=Image, Output=Im...   2026.01   0       410