Cross-lingual Reasoning and Factual Knowledge on Global MMLU (test)
[Chart: Accuracy (RUS) over time. Plotted point: w_TREX, 23.46, Jan 20, 2026.]
Evaluation Results
| Metric | w_TREX | w_llama |
|---|---|---|
| Details | Model Size=200M, Train... | Model Size=200M, Train... |
| Date | 2026.01 | 2026.01 |
| Accuracy (RUS) | 23.46 | 22.95 |
| Accuracy (POL) | 23.75 | 23.1 |
| Accuracy (CMN) | 23.09 | 22.91 |
| Accuracy (NLD) | 23.69 | 24.16 |
| Accuracy (DEU) | 24.17 | 22.89 |
| Accuracy (IND) | 26.75 | 22.95 |
| Accuracy (JPN) | 22.95 | 22.92 |
| Accuracy (TUR) | 24.12 | 22.97 |
| Accuracy (FRA) | 23.99 | 23.31 |
| Accuracy (CES) | 23.17 | 23.05 |
| Accuracy (ITA) | 23.56 | 22.9 |
| Accuracy (FAS) | 22.93 | 22.97 |
| Accuracy (POR) | 22.95 | 22.97 |
| Accuracy (SWE) | 24.51 | 24.09 |
| Accuracy (ELL) | 23.01 | 22.92 |
| Accuracy (VIE) | 25.14 | 24.72 |
| Average Accuracy | 23.83 | 23.24 |
| Difference | 0.59 | - |
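
The Average Accuracy and Difference figures can be cross-checked from the per-language scores. The sketch below is only an illustration under two assumptions that the page does not state: that Average Accuracy is the unweighted mean of the 16 per-language accuracies, and that Difference is the gap between a method's average and the w_llama average. The function name `average` and the half-up rounding mode are choices made for this example, not something the leaderboard documents.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-language accuracies copied from the results, in RUS ... VIE order.
W_TREX = ["23.46", "23.75", "23.09", "23.69", "24.17", "26.75", "22.95", "24.12",
          "23.99", "23.17", "23.56", "22.93", "22.95", "24.51", "23.01", "25.14"]
W_LLAMA = ["22.95", "23.1", "22.91", "24.16", "22.89", "22.95", "22.92", "22.97",
           "23.31", "23.05", "22.9", "22.97", "22.97", "24.09", "22.92", "24.72"]


def average(scores):
    """Unweighted mean of the scores, rounded half-up to two decimals.

    Decimal is used instead of float so that a value sitting exactly on a
    rounding boundary (such as 23.8275) rounds predictably.
    """
    total = sum(Decimal(s) for s in scores)
    return (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


avg_trex = average(W_TREX)
avg_llama = average(W_LLAMA)

# Assumption: Difference is measured against the w_llama average.
difference = avg_trex - avg_llama

print(avg_trex, avg_llama, difference)  # prints: 23.83 23.24 0.59
```

Under these assumptions the computed values match the reported Average Accuracy (23.83 and 23.24) and the reported Difference (0.59).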