Language Understanding on MMLU-K
Top result: ZipCal, 76.37 Accuracy
[Chart: Accuracy and Delta; data point dated Mar 17, 2026]
Updated 1mo ago
Evaluation Results

Method | Configuration             | Links   | Accuracy | Delta
ZipCal | Model=Llama-3.1-8B-Ins... | 2026.03 | 76.37    | 14.35
ZipCal | Model=gemma-2-9b-it, C... | 2026.03 | 35.19    | 3.93
ZipCal | Model=Llama-3.1-8B-Ins... | 2026.03 | 31.26    | -4.33
ZipCal | Model=gemma-2-9b-it, C... | 2026.03 | 21.17    | -1.6