Knowledge Preservation and Reasoning on MMLU
[Figure: MMLU Score chart. Highlighted entry: Base Model (Llama3.2-3B), 61.46, Jan 29, 2026.]
Evaluation Results
Method                     Configuration              Date     MMLU Score
Base Model (Llama3.2-3B)   Backbone=Llama 3.2-3B-...  2026.01  61.46
DUET                       Backbone=Llama 3.2-3B-...  2026.01  61.45
GA (DQA_f) + KL (Dr)       Backbone=Llama 3.2-3B-...  2026.01  60.62
NPO (DQA_f) + KL (Dr)      Backbone=Llama 3.2-3B-...  2026.01  60.55
NPO (DQA_f)                Backbone=Llama 3.2-3B-...  2026.01  60.48
Refusal-Training           Backbone=Llama 3.2-3B-...  2026.01  60.48
SimNPO                     Backbone=Llama 3.2-3B-...  2026.01  60.4
GA + KL (Dr)               Backbone=Llama 3.2-3B-...  2026.01  60.18
NPO + KL (Dr)              Backbone=Llama 3.2-3B-...  2026.01  59.47
FLAT                       Backbone=Llama 3.2-3B-...  2026.01  58.92
NPO                        Backbone=Llama 3.2-3B-...  2026.01  54.79
GA (DQA_f)                 Backbone=Llama 3.2-3B-...  2026.01  36.45
GA                         Backbone=Llama 3.2-3B-...  2026.01  24.87
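
One way to read these results is as the drop in MMLU Score relative to the base model. The following is a minimal sketch of that arithmetic, not part of the benchmark itself: the scores are copied from the results listed above, while the function name `drop_from_base` and the choice to rank by absolute drop are assumptions made for illustration.

```python
# Scores copied from the results above (MMLU Score per method).
SCORES = {
    "Base Model (Llama3.2-3B)": 61.46,
    "DUET": 61.45,
    "GA (DQA_f) + KL (Dr)": 60.62,
    "NPO (DQA_f) + KL (Dr)": 60.55,
    "NPO (DQA_f)": 60.48,
    "Refusal-Training": 60.48,
    "SimNPO": 60.4,
    "GA + KL (Dr)": 60.18,
    "NPO + KL (Dr)": 59.47,
    "FLAT": 58.92,
    "NPO": 54.79,
    "GA (DQA_f)": 36.45,
    "GA": 24.87,
}

BASE = "Base Model (Llama3.2-3B)"


def drop_from_base(scores, base=BASE):
    """Hypothetical helper: MMLU points lost by each method versus the base model.

    Returns a dict ordered from smallest to largest drop, rounded to 2 decimals.
    """
    base_score = scores[base]
    drops = {
        method: round(base_score - score, 2)
        for method, score in scores.items()
        if method != base
    }
    return dict(sorted(drops.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    for method, drop in drop_from_base(SCORES).items():
        print(f"{method:24s} {drop:6.2f}")
```

Under this reading, a smaller drop means the method's score stays closer to the base model's 61.46.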