Real World Visual Reasoning on MEGA
Accuracy chart: best result 54.7 (Gemini 2.0 Flash), Feb 12, 2026. Updated 1mo ago.
Evaluation Results
Method                          Zero-shot   Date      Accuracy
Gemini 2.0 Flash                true        2026.02   54.7
Qwen2.5-VL-32B + AT-RL (Ours)   true        2026.02   53
OpenAI GPT-4o                   true        2026.02   52.7
Claude 3.5 Sonnet               true        2026.02   52.5
Qwen2.5-VL-32B + VPPO           true        2026.02   52.3
Qwen2.5-VL-72B Instruct         true        2026.02   49.6
Qwen2.5-VL-32B Instruct         true        2026.02   48.9
Llama 4 Scout                   true        2026.02   31.8