
Real-world perception-centric reasoning on RealWorldQA (test)

77.04 Accuracy

GLM-9B-DeltaThinker

May 15, 2026
Updated 16d ago

Evaluation Results

Date       Accuracy
2026.05    77.04
2026.05    75.82
           73.07
           72.19
2026.05    72.03
           67.58
2026.05    66.67
           64.36