Language Understanding on MMLU-ProX (T1)
Top listed result: 26.4 accuracy, Naive Fine-tuning (Apr 22, 2026)
[Chart: Accuracy over time]
Evaluation Results
Method              Configuration                Date      Accuracy
Naive Fine-tuning   Backbone=Phi-4-Mini-In...    2026.04   26.4
Full Retraining     Backbone=Phi-4-Mini-In...    2026.04   26.1
EWC                 Backbone=Phi-4-Mini-In...    2026.04   25.9
COMPASS-ECDA        Backbone=Phi-4-Mini-In...    2026.04   25.8
Random Rehearsal    Backbone=Phi-4-Mini-In...    2026.04   25.7
Full Retraining     Model=Qwen2.5-7B-Instruct    2026.04   0.408
Naive Fine-tuning   Model=Qwen2.5-7B-Instruct    2026.04   0.406
COMPASS-ECDA        Model=Qwen2.5-7B-Instruct    2026.04   0.405
EWC                 Model=Qwen2.5-7B-Instruct    2026.04   0.402
Random Rehearsal    Model=Qwen2.5-7B-Instruct    2026.04   0.402

Accuracy values are shown as reported. The Phi-4-Mini rows appear to use a 0-100 scale and the Qwen2.5-7B-Instruct rows a 0-1 scale, so the two groups are not directly comparable as listed.