AGIEval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Comprehensive Examination | AGIEval (test) | Accuracy 62.3 | 34 |
| General Reasoning | AGIEval | Exact Match 70.4 | 33 |
| Natural Language Understanding | AGIEval | Accuracy 71.6 | 24 |
| Question Answering | AGIEval | Vanilla Accuracy 43.92 | 14 |
| Mathematical Reasoning | AGIEval MATH | Accuracy 95.7 | 12 |
| Question Answering | AGIEval | Accuracy 32.11 | 12 |
| Mathematical Reasoning | AGIEval-MATH (test) | Accuracy 52.1 | 11 |
| General Evaluation | AGIEval | Accuracy 70.22 | 8 |
| General Intelligence Evaluation | AGIEval (test) | AGIEval (3-shot) 27 | 8 |
| Question Answering | AGIEval (test) | AQUA-RAT 28.3 | 5 |
| General Intelligence Evaluation | AGIEval G | Accuracy 72 | 4 |
| General Reasoning | agieval | Accuracy 46.03 | 4 |
| General Language Understanding | AGIEval 5-shot | Accuracy 80.22 | 3 |