High-Resolution Visual Understanding on HR-Bench-4K (Pass@1)
[Chart: Pass@1 by date (x-axis label: May 27, 2026). Best result: SFT + AXPO, Pass@1 0.833.]
Updated 6d ago
Evaluation Results
| Method | Base Model | Date | Pass@1 |
|---|---|---|---|
| SFT + AXPO | Qwen3-VL-8B... | 2026.05 | 0.833 |
| PyVision-RL | Qwen2.5-VL-7B | 2026.05 | 0.781 |
| DeepEyes-v2 | Qwen2.5-VL-7B | 2026.05 | 0.779 |
| Mini-o3 | Qwen2.5-VL-7B | 2026.05 | 0.775 |
| Thyme | Qwen2.5-VL-7B | 2026.05 | 0.77 |
| DeepEyes | Qwen2.5-VL-7B | 2026.05 | 0.751 |
| PixelReasoner | Qwen2.5-VL-7B | 2026.05 | 0.74 |
| Qwen3-VL-8B-Thinking (Agent) | Qwen3-VL-8B... | 2026.05 | 0.728 |
| Qwen2.5-VL-7B-Instruct | Qwen2.5-VL-7B | 2026.05 | 0.716 |