Vision Reasoning on Tunnel Vision, BabyVision, and TRACER-Bench
[Chart: Overall Score. Top entry: Human, 98.3 (May 13, 2026).]
Updated 15d ago
Evaluation Results
| Method | Details | Date | Overall Score |
|---|---|---|---|
| Human | | 2026.05 | 98.3 |
| GPT-5.2 | | 2026.05 | 61.8 |
| Gemini 3.1 Pro | | 2026.05 | 56.4 |
| CAVE-7B | Param=7B, Instruct=false | 2026.05 | 33.3 |
| CAVE-3B | Param=3B, Instruct=false | 2026.05 | 30.5 |
| Qwen3-VL | Param=8B, Instruct=true | 2026.05 | 30.1 |
| CAVE Belief-Only | Param=7B, Instruct=false | 2026.05 | 29.6 |
| InternVL3.5 | Param=8B, Instruct=true | 2026.05 | 29.2 |
| MiniCPM-V 4.5 | Param=8B, Instruct=false | 2026.05 | 29.2 |
| DeepEyes | Param=7B, Instruct=false | 2026.05 | 28.5 |
| ThinkLite-VL | Param=7B, Instruct=false | 2026.05 | 27.4 |
| VLAA-Thinker | Param=7B, Instruct=false | 2026.05 | 27 |
| R1-OneVision | Param=7B, Instruct=false | 2026.05 | 26 |
| Qwen2.5-VL | Param=7B, Instruct=true | 2026.05 | 25.9 |
| Random | | 2026.05 | 24.1 |