Visual Search and Perception-intensive Reasoning on HR-Bench 4K
Top Overall Score: 76.63 (Starve to Perceive)
[Chart: Overall Score; May 18, 2026]
Updated 15d ago
Evaluation Results
Method                             Source                      Date      Overall Score
Starve to Perceive                 Visual Bandwidth Const...   2026.05   76.63
Qwen2.5-VL + BA-SFT + VanillaRL    Visual Bandwidth Const...   2026.05   75.25
Qwen2.5-VL + BA-SFT                Visual Bandwidth Const...   2026.05   72.5
ChainOfFocus                       Visual Bandwidth Const...   2026.05   71.63
DeepEyes                           Visual Bandwidth Const...   2026.05   71.5
PixelReasoner                      Visual Bandwidth Const...   2026.05   71.13
GPT-4o                             Visual Bandwidth Const...   2026.05   68
Qwen2.5-VL + BA-SFT + VanillaRL    Visual Bandwidth Const...   2026.05   66.25
Starve to Perceive                 Visual Bandwidth Const...   2026.05   66.13
Qwen2.5-VL                         Visual Bandwidth Const...   2026.05   65.25
LLaVA-OneVision                    Visual Bandwidth Const...   2026.05   63
ChainOfFocus                       Visual Bandwidth Const...   2026.05   59.63
Qwen2.5-VL + BA-SFT                Visual Bandwidth Const...   2026.05   59.25
Mini-o3                            Visual Bandwidth Const...   2026.05   57.75
DeepEyes                           Visual Bandwidth Const...   2026.05   57.13
Mini-o3                            Visual Bandwidth Const...   2026.05   51.63
PixelReasoner                      Visual Bandwidth Const...   2026.05   49.38
Qwen2.5-VL                         Visual Bandwidth Const...   2026.05   38