SOTA General Language Model Capability benchmarks and papers with code | Wizwand

General Language Model Capability

Benchmarks

Dataset Name	SOTA Method	Metric	Trend
MMLU, GSM8K, HumanEval, BBH Combined	VAR	Average Score: 68.42		8	26d ago
