
Real-world perception-centric reasoning on HRBench 4K (test)

Accuracy: 80.25

GLM-9B-DeltaThinker

Updated 2mo ago

Evaluation Results

Method	Links	Accuracy
GLM-9B-DeltaThinker 2026.05		80.25
Qwen-8B-DeltaThinker 2026.05		77.25
Vision-R1-7B 2026.05		75.38
GLM-4.1V-9B-Thinking 2026.05		73.75
Qwen3-VL-8B-Thinking 2026.05		72.22
ARES-RL-7B 2026.05		72.13
REVisual-R1 2026.05		71.88
Bee-8B-RL 2026.05		60.15