Humanity's Last Exam

Benchmarks

Task Name                       | Dataset Name                         | SOTA Result        | Trend
Reasoning                       | Humanity's Last Exam                 | Accuracy: 84.61    | 46
Question Answering              | Humanity's Last Exam                 | Pass@1: 51.7       | 16
Expert-level Reasoning          | Humanity's Last Exam 2,158 text-only | Avg@3 Score: 54.2  | 15
Expert-Level Question Answering | Humanity's Last Exam                 | Accuracy: 40.9     | 14
Complex Reasoning               | Humanity's Last Exam (HLE)           | Pass@1 Score: 18.4 | 13
Question Answering              | Humanity's Last Exam (HLE) MCQ       | Accuracy: 19.9     | 6
Long Context Evaluation         | Humanity's Last Exam AA-LCR          | Accuracy: 54.3     | 6
World Knowledge                 | HUMANITY’S LAST EXAM text-only       | Score: 11.1        | 4