General Language Understanding on MMLU (Accuracy, IR)
Best Accuracy: 71.6 (Qwen2.5-14B)
[Chart: Accuracy and IR Score, Apr 9, 2026]
Updated 8d ago
Evaluation Results
| Method | Details | Date | Accuracy | IR Score |
| --- | --- | --- | --- | --- |
| Qwen2.5-14B | Backbone=Qwen2.5-14B,... | 2026.04 | 71.6 | 4.7 |
| HyTuning | Backbone=Qwen2.5-14B,... | 2026.04 | 71.5 | 3.7 |
| HyTuning | Backbone=Qwen2.5-7B, P... | 2026.04 | 61.4 | 8.1 |
| Qwen2.5-7B | Backbone=Qwen2.5-7B, P... | 2026.04 | 61.3 | 12.6 |
| HyTuning | Backbone=Qwen2.5-3B, P... | 2026.04 | 55.2 | 9.9 |
| INTUITOR | Backbone=Qwen2.5-3B, P... | 2026.04 | 47.6 | 21.1 |
| RLPR | Backbone=Qwen2.5-3B, P... | 2026.04 | 45.1 | 23.9 |
| RLVR | Backbone=Qwen2.5-3B, P... | 2026.04 | 45 | 24 |
| HPT | Backbone=Qwen2.5-3B, P... | 2026.04 | 45 | 24 |
| Qwen2.5-3B | Backbone=Qwen2.5-3B, P... | 2026.04 | 44.9 | 24.1 |
| RD | Backbone=Qwen2.5-3B, P... | 2026.04 | 44.9 | 24.5 |
| SFT | Backbone=Qwen2.5-3B, P... | 2026.04 | 44.3 | 25.1 |