General Language Understanding and Reasoning on MMLU Suite, AGIEval, BBH, ARC, and BoolQ Aggregation
Accuracy
No plottable results for Accuracy (PERCENT).
Metrics: Accuracy (percent), Exact Match (percent)
Updated 1mo ago
Evaluation Results
Method | Links | Accuracy | Exact Match
No evaluation results found.