
Multiple-choice reasoning on FlameBench

Accuracy: 32.64

GLM-4

[Chart: Accuracy by date; Feb 27, 2026]
Updated 27d ago

Evaluation Results

Method    Links
2026.02
32.64
2026.02
32.1
28.37
2026.02
15.6