Utility Evaluation on MMLU
[Chart: ΔMMLU by method; best result 0.2 (CNT); dated Mar 19, 2026]
Evaluation Results

Method      Configuration               Date      ΔMMLU
CNT         Model=DeepSeek-V2-Lite...   2026.03     0.2
CNT         Model=Mistral-Nemo-Ins...   2026.03     0.12
RD          Model=DeepSeek-V2-Lite      2026.03     0.01
CNT         Model=Llama3-8B, Trans...   2026.03    -0.18
RD          Model=Llama3.2-3B           2026.03    -0.2
RD          Model=Mistral-Nemo-Ins...   2026.03    -0.44
RD          Model=Llama3-8B             2026.03    -0.65
CNT         Model=Llama3.2-3B, Tra...   2026.03    -0.81
CNT         Model=Average, Transfe...   2026.03    -1.37
RD          Model=Average               2026.03    -1.38
RD          Model=Yi-1.5-6B             2026.03    -5.64
CNT         Model=Yi-1.5-6B, Trans...   2026.03    -6.21
ActSVD-OP   Model=Llama3.2-3B           2026.03    -6.62
ActSVD-OP   Model=Yi-1.5-6B             2026.03    -7.24
ActSVD-OP   Model=Mistral-Nemo-Ins...   2026.03    -7.66
ActSVD-OP   Model=Average               2026.03   -13.49
ActSVD-OP   Model=Llama3-8B             2026.03   -32.43
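
The "Average" rows look like per-method means over the models listed for that method. The page does not confirm this, so the sketch below treats it as an assumption: it takes an unweighted mean of the per-model ΔMMLU values above and compares it with the reported average. The variable names are illustrative only.

```python
# Per-model ΔMMLU values copied from the results above (Average rows excluded).
# Assumption: each reported average is an unweighted mean over these models.
per_model = {
    "CNT": [0.2, 0.12, -0.18, -0.81, -6.21],
    "RD": [0.01, -0.2, -0.44, -0.65, -5.64],
    "ActSVD-OP": [-6.62, -7.24, -7.66, -32.43],
}

# Values from the "Model=Average" rows.
reported_average = {"CNT": -1.37, "RD": -1.38, "ActSVD-OP": -13.49}


def mean(values):
    """Unweighted arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


recomputed = {method: mean(values) for method, values in per_model.items()}

for method, value in recomputed.items():
    gap = abs(value - reported_average[method])
    print(f"{method}: recomputed {value:.4f}, "
          f"reported {reported_average[method]}, gap {gap:.4f}")
```

Under that assumption, each recomputed mean is within 0.01 of the reported figure. A gap of that size is consistent with the per-model values having been rounded to two decimals before display.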