Multitask Language Understanding on MMLU
[Chart: Average Relative Improvement. Highlighted point: TBDF, 4.46, Jan 29, 2026. Metrics: Average Relative Improvement, Superior/Inferior Ratio.]
Updated 4d ago
Evaluation Results
Method | Configuration | Date | Average Relative Improvement | Superior/Inferior Ratio
TBDF | Filtering Mode=FW-EDU,... | 2026.01 | 4.46 | -
TBDF | Filtering Mode=General... | 2026.01 | 1.52 | -
CB | Filtering Mode=FW-EDU,... | 2026.01 | -0.61 | -