Real-world perception-centric reasoning on Real-world perception-centric reasoning suite (test)
Top Average Score: 55.53 (GLM-9B-DeltaThinker)
[Chart: Average Score by date; GLM-9B-DeltaThinker, 55.53, May 15, 2026]
Updated 16d ago
Evaluation Results
| Method | Settings | Date | Average Score |
|---|---|---|---|
| GLM-9B-DeltaThinker | Algorithm=DeltaPrompts | 2026.05 | 55.53 |
| Qwen-8B-DeltaThinker | Algorithm=DeltaPrompts | 2026.05 | 54.12 |
| GLM-4.1V-9B-Thinking | | 2026.05 | 49.75 |
| Vision-R1-7B | | 2026.05 | 48.67 |
| REVisual-R1 | | 2026.05 | 48.06 |
| Qwen3-VL-8B-Thinking | | 2026.05 | 46.92 |
| ARES-RL-7B | | 2026.05 | 46.45 |
| Bee-8B-RL | | 2026.05 | 40.43 |