Unified Multi-task Language Understanding and Instruction Following on Open LLM Leaderboard
Chart: MMLU-P (Accuracy), in percent. No plottable results.
Metric
MMLU-P (Accuracy) (PERCENT)
GPQA (Accuracy) (PERCENT)
BBH (Accuracy) (PERCENT)
MATH (Exact Match) (PERCENT)
MuSR (Accuracy) (PERCENT)
IFE-I (Strict Accuracy) (PERCENT)
IFE-P (Strict Accuracy) (PERCENT)
Updated 4d ago
Evaluation Results
Method | Links | MMLU-P (Accuracy) | GPQA (Accuracy) | BBH (Accuracy) | MATH (Exact Match) | MuSR (Accuracy) | IFE-I (Strict Accuracy) | IFE-P (Strict Accuracy)
No evaluation results found.