Overall Evaluation on Demo-ICL-Bench
[Chart: Average Score. Top entry: Human, 80.1. Date shown: Feb 9, 2026.]
Evaluation Results
| Method           | Settings          | Date    | Average Score |
|------------------|-------------------|---------|---------------|
| Human            | -                 | 2026.02 | 80.1          |
| Gemini-2.5-Pro   | -                 | 2026.02 | 38.9          |
| GPT-4o           | -                 | 2026.02 | 34.9          |
| Demo-ICL         | Size=7B, Frame=32 | 2026.02 | 33.1          |
| Demo-ICL (SFT)   | Size=7B, Frame=32 | 2026.02 | 29.8          |
| Qwen2.5-VL       | Size=72B, Frame=32 | 2026.02 | 29.5         |
| LLaVA-Video      | Size=7B, Frame=32 | 2026.02 | 27.2          |
| VideoChat-R1     | Size=7B, Frame=32 | 2026.02 | 27            |
| Qwen2.5-VL       | Size=7B, Frame=32 | 2026.02 | 26.3          |
| Video-R1         | Size=7B, Frame=32 | 2026.02 | 26.2          |
| InternVL-3       | Size=8B, Frame=32 | 2026.02 | 25            |
| Ola-Video (Base) | Size=7B, Frame=32 | 2026.02 | 24.8          |
| Ola              | Size=7B, Frame=32 | 2026.02 | 24.3          |
| Qwen2-VL         | Size=7B, Frame=32 | 2026.02 | 22            |