Visual Reasoning on HRBench-4K
[Chart: Accuracy. Best result: S1-VL-32B-RL, 91.38 (Apr 23, 2026).]
Evaluation Results
Method                         Category            Date       Accuracy
S1-VL-32B-RL                   Thinking-with...    2026.04    91.38
S1-VL-32B-SFT                  Thinking-with...    2026.04    85
Gemini 2.5 Pro                 Proprietary M...    2026.04    83.9
Qwen3-VL-235B-A22B-Thinking    Open-Source M...    2026.04    83
Skywork-R1V4-30B               Thinking-with...    2026.04    82.8
Qwen3-VL-32B-Thinking          Open-Source M...    2026.04    82.63
Intern-S1 (235B+6B)            Open-Source M...    2026.04    82.5
Gemini 2.5 Flash               Proprietary M...    2026.04    77.5
Thyme-VL (7B)                  Thinking-with...    2026.04    77
GPT-5                          Proprietary M...    2026.04    74.25
Qwen2.5-VL-32B                 Open-Source M...    2026.04    73.4
InternVL3-8B                   Open-Source M...    2026.04    70
Qwen2.5-VL-7B                  Open-Source M...    2026.04    68.8
Intern-S1-mini (8B)            Open-Source M...    2026.04    62.13