Question Answering on MMLU Redux (Subject-Specific Accuracy)

Business Ethics Accuracy: 77.5

GPT-5

Mar 19, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.03   77.5  80    73    83.5  81    70    82    64.8  55.5  58    72.5
2026.03   76    80    78    84    82    70    78    62    56    64    73
2026.03   75.5  79.2  73    83    82.3  68.8  77    59.7  56    57.3  71.2
2026.03   75.3  79    75.5  81    79.8  69.3  77.5  67.3  57    58.5  72
2026.03   75    82    73    85    84.5  70.5  79    64.5  57    62    73.3
2026.03   74    79    67.8  73.3  80    67.8  77.3  62.5  51.3  58.3  69.1
2026.03   70.5  76    67.7  76.2  76    68.2  72.5  59.7  55.5  54    67.6
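
Each result row appears to hold ten per-subject accuracies followed by one overall figure. The sketch below checks whether that last figure matches the unweighted mean of the ten before it. That reading is an assumption: the page does not label the columns. The names `ROWS`, `mean_gap`, and `consistent` are hypothetical helpers for illustration only.

```python
# Each row: ten per-subject accuracies, then the final reported value.
# Assumption (not stated on the page): the final value is the unweighted
# mean of the ten preceding scores, rounded to one decimal place.
ROWS = [
    [77.5, 80, 73, 83.5, 81, 70, 82, 64.8, 55.5, 58, 72.5],
    [76, 80, 78, 84, 82, 70, 78, 62, 56, 64, 73],
    [75.5, 79.2, 73, 83, 82.3, 68.8, 77, 59.7, 56, 57.3, 71.2],
    [75.3, 79, 75.5, 81, 79.8, 69.3, 77.5, 67.3, 57, 58.5, 72],
    [75, 82, 73, 85, 84.5, 70.5, 79, 64.5, 57, 62, 73.3],
    [74, 79, 67.8, 73.3, 80, 67.8, 77.3, 62.5, 51.3, 58.3, 69.1],
    [70.5, 76, 67.7, 76.2, 76, 68.2, 72.5, 59.7, 55.5, 54, 67.6],
]


def mean_gap(row):
    """Absolute difference between the mean of the subject scores and the last value."""
    *subjects, reported = row
    return abs(sum(subjects) / len(subjects) - reported)


def consistent(row, tol=0.051):
    """True if the last value is within one-decimal rounding error of the mean.

    Rounding to one decimal can shift a value by at most 0.05; the extra
    0.001 absorbs floating-point noise.
    """
    return mean_gap(row) <= tol


if __name__ == "__main__":
    for row in ROWS:
        mean = sum(row[:-1]) / (len(row) - 1)
        print(f"reported={row[-1]:<5} mean={mean:.2f} consistent={consistent(row)}")
```

A tolerance is used instead of comparing rounded values directly because a mean that falls exactly on a half step (such as 73.25) can round either way depending on the rounding rule.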