
AGI Eval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| General Reasoning | AGI Eval English | Score: 90.1 | 32 |
| General Intelligence | AGI Eval | AGI Eval Score: 40.2 | 24 |
| Reasoning | AGI Eval EN | Accuracy: 89.4 | 15 |
| General Intelligence Evaluation | AGI Eval English | Score: 92.2 | 8 |
| Text-to-Image Generation | AGI-Eval text-to-image arena 6 | ELO Score: 0.4859 | 6 |
| General Intelligence Evaluation | AGI-Eval | Accuracy: 44.2 | 2 |