
Multiple Choice Question Answering on MMLU-Pro (Professional Medicine)

Accuracy: 94

GPT-4o

Chart: accuracy over time (x-axis: Mar 29, 2022 to Jan 26, 2026; y-axis: accuracy)
Updated 1mo ago

Evaluation Results

Date      Accuracy
2024.10   94
2024.10   94
2024.10   92
2024.10   92
2026.01   84.3
2026.01   83.19
2026.01   82.74
2026.01   79.93
2026.01   79.54
2026.01   79.54
2026.01   77.92
2026.01   76.94
2026.01   76.61
2026.01   75.05
2026.01   74.53
2026.01   71.53
2026.01   65.73
2026.01   63.13
2022.03   50.7
2022.03   43.2
2022.03   38.7