Multiple-choice reasoning on FlameBench
Accuracy: 32.64 (GLM-4), Feb 27, 2026
Updated 27d ago
Evaluation Results
Method         Date       Accuracy
GLM-4          2026.02    32.64
Gemini Pro     2026.02    32.1
DeepSeek-R1    2026.02    28.37
GPT-5          2026.02    15.6