
AGI Eval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| General Reasoning | AGI Eval English | Score: 90.1 | 32 |
| General Intelligence | AGI Eval | AGI Eval Score: 40.2 | 24 |
| Reasoning | AGI Eval EN | Accuracy: 89.4 | 15 |
| Reasoning | AGI Eval | Avg@1 (AGI Eval Reasoning): 73.4 | 12 |
| Mathematical Reasoning | AGI-Eval Math | Overall Accuracy: 94.7 | 11 |
| General Intelligence Evaluation | AGI-Eval | Accuracy: 61.2 | 10 |
| General Intelligence Evaluation | AGI Eval English | Score: 92.2 | 8 |
| Text-to-Image Generation | AGI-Eval text-to-image arena 6 | ELO Score: 0.4859 | 6 |