Visual Search and Perception-intensive Reasoning on HR-Bench 8K
[Chart: Score by date. Starve to Perceive: 70 (May 18, 2026).]
Updated 15d ago
Evaluation Results
Method                           Links                      Date     Score
Starve to Perceive               Visual Bandwidth Const...  2026.05  70
PixelReasoner                    Visual Bandwidth Const...  2026.05  69.25
DeepEyes                         Visual Bandwidth Const...  2026.05  67.88
Qwen2.5-VL + BA-SFT + VanillaRL  Visual Bandwidth Const...  2026.05  67.75
ChainOfFocus                     Visual Bandwidth Const...  2026.05  67.25
Qwen2.5-VL + BA-SFT              Visual Bandwidth Const...  2026.05  65.88
GPT-4o                           Visual Bandwidth Const...  2026.05  63.9
Qwen2.5-VL                       Visual Bandwidth Const...  2026.05  63
Starve to Perceive               Visual Bandwidth Const...  2026.05  60.38
Qwen2.5-VL + BA-SFT + VanillaRL  Visual Bandwidth Const...  2026.05  60
LLaVA-OneVision                  Visual Bandwidth Const...  2026.05  59.8
Qwen2.5-VL + BA-SFT              Visual Bandwidth Const...  2026.05  56.13
ChainOfFocus                     Visual Bandwidth Const...  2026.05  53.75
DeepEyes                         Visual Bandwidth Const...  2026.05  51.25
Mini-o3                          Visual Bandwidth Const...  2026.05  47.75
PixelReasoner                    Visual Bandwidth Const...  2026.05  43.13
Mini-o3                          Visual Bandwidth Const...  2026.05  40.88
Qwen2.5-VL                       Visual Bandwidth Const...  2026.05  26.5