Multilingual Reasoning on MMLU-ProX (1k Stratified Subset Test)
[Chart: Accuracy by date. Highlighted result: Task Arithmetic, 18.7 Accuracy (Feb 9, 2026). Metrics shown: Accuracy, Average Score.]
Evaluation Results
Method           Setting                   Date     Accuracy  Average Score
Task Arithmetic  Fusion Setting=Math +...  2026.02  18.7      46.4
Weight Average   Fusion Setting=Math +...  2026.02  18.5      46.4
SAE (Local)      Fusion Setting=Math +...  2026.02  18.2      47.8
Rankmean         Fusion Setting=Math +...  2026.02  17.6      15.7
SAE (Global)     Fusion Setting=Math +...  2026.02  17        48.4
PSO              Fusion Setting=Math +...  2026.02  16.4      47.2