
Real-world perception-centric reasoning on HRBench 4K (test)

Accuracy: 80.25

GLM-9B-DeltaThinker

May 15, 2026
Updated 16d ago

Evaluation Results

Method | Links
2026.05
80.25
2026.05
77.25
75.38
73.75
72.22
2026.05
72.13
71.88
2026.05
60.15