Vision-Language Reasoning on UHR-Micro (test)
[Chart: Average Score over time. Top entry: MAP-Agent, 44.1, May 12, 2026. Updated 21d ago.]
Evaluation Results
Method                                  Date     Average Score
MAP-Agent (base_model=Qwen3-VL-8B)      2026.05  44.1
Qwen3-VL-235B                           2026.05  32.83
Qwen3-VL-30B                            2026.05  31.51
MAP-Agent (base_model=InternVL3.5-8B)   2026.05  28.75
Qwen3-VL-8B                             2026.05  27.99
Gemini-2.5-Pro                          2026.05  27.46
DeepEyes                                2026.05  26.01
Qwen2.5-VL-7B                           2026.05  24.78
GPT-5.2                                 2026.05  24.76
InternVL3.5-38B                         2026.05  21.21
InternVL3.5-8B                          2026.05  20.44
Qwen2-VL-7B                             2026.05  19.35
GeoChat                                 2026.05  17.17
Claude 3.7 Sonnet                       2026.05  15.37
GeoEyes                                 2026.05  9.53
GeoLLaVa-8k                             2026.05  8.85