Science on MMLU-Pro (test)
[Chart: Accuracy. Top result: Base, 41.9 accuracy (May 6, 2026). Updated 27d ago.]
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| Base | Backbone=Qwen2.5-3B-In... | 2026.05 | 41.9 |
| Low-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 41.7 |
| DFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 41.3 |
| Self-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 41.1 |
| Anchored Learning | Backbone=Qwen2.5-3B-In... | 2026.05 | 40.9 |
| Self-sft | Backbone=Qwen2.5-3B-In... | 2026.05 | 39.4 |
| KL-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 38.3 |
| SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 37.5 |
| STM | Backbone=Qwen2.5-3B-In... | 2026.05 | 36.1 |
| Iter-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 35.8 |
| Anchored Learning | Backbone=Qwen2.5-3B-In... | 2026.05 | 35.6 |
| Iter-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 33.8 |
| Low-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 31.7 |
| DFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 29.7 |
| STM | Backbone=Qwen2.5-3B-In... | 2026.05 | 27.7 |
| SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 10.9 |
| KL-SFT | Backbone=Qwen2.5-3B-In... | 2026.05 | 9.6 |