
Scientific Question Answering on MMLU-Pro

88.6 Accuracy

GPT-5

[Chart: accuracy over time. Y-axis ticks: 69.568, 74.509, 79.45, 84.391. X-axis: Feb 16, 2025 to Aug 26, 2025.]
Updated 5d ago

Evaluation Results

Method  Links

Date      Accuracy   Other
2025.08   88.6       --2.1
2025.08   86.6       --0.9
2025.08   86.5       ---
2025.08   86.2       --1.2
2025.08   86         --1.9
2025.08   85.7       ---
2025.08   85.3       --1.2
2025.08   85         --2.9
2025.08   85         ---
2025.08   84.1       ---
2025.08   84.1       ---
2025.08   82.1       ---
2025.02   73.7       23.4-
2025.02   71.7       ---
2025.02   71.5       ---
2025.02   70.3       ---