Long Context Evaluation on Humanity's Last Exam AA-LCR
Accuracy: 54.3 (GLM-4.6), Jan 14, 2026
Updated 4d ago
Evaluation Results
Method         Settings                    Date      Accuracy
GLM-4.6        Thinking Mode=true, Pa...   2026.01   54.3
DeepSeek-V3.1  Thinking Mode=true, Pa...   2026.01   53.3
A.X K1         Thinking Mode=true, Pa...   2026.01   36
GLM-4.6        Thinking Mode=true, Pa...   2026.01   13.3
DeepSeek-V3.1  Thinking Mode=true, Pa...   2026.01   13
A.X K1         Thinking Mode=true, Pa...   2026.01   8.6