Multiple Choice Question Answering on MMLU English (test)

Accuracy: 89

OpenAI-o3-mini

[Chart omitted. Axis ticks: -0.44, 22.78, 46, 69.22; date label: Feb 18, 2025]
Updated 20d ago

Evaluation Results

Date      Accuracy   (unlabeled)
2025.02   89         -
2025.02   86         -
2025.02   84         -
2025.02   81         -
2025.02   78         -
2025.02   75         -
2025.02   73         -
2025.02   72         -
2025.02   68         -
2025.02   66         -
2025.02   65         26.97
2025.02   64         -
2025.02   60         16.67
2025.02   59         -
2025.02   58         30.95
2025.02   51         37.04
2025.02   49         -
2025.02   40         -
2025.02   40         53.49
2025.02   38         -
2025.02   32         57.33
2025.02   32         -
2025.02   30         61.54
2025.02   29         27.5
2025.02   29         57.35
2025.02   26         59.38
2025.02   18         69.49
2025.02   16         78.08
2025.02   11         71.05
2025.02   9          86.36
2025.02   7          85.71
2025.02   3          90.63