
Capability Self-Assessment on MMLU-Pro Science

69.1 M-F1

OLMo2-7B

18.55631.67844.857.922
May 29, 2026
Updated 1d ago

Evaluation Results

Method Links
2026.05  69.1  10.95  106.5
2026.05  67.4  10.12  102.2
2026.05  66.3   7.83   88.3
2026.05  65.7  8.499.5
2026.05  65.4   7.08   76.9
2026.05  64.4   8.07  101.8
2026.05  63.6   7.38   87.6
2026.05  63.6   6.99   61.1
2026.05  62.4   6.72  101.5
2026.05  62.3   6.34   94
2026.05  62     7.07   99.8
2026.05  61.1   6.39   89.6
2026.05  61.1   5.61   80.8
2026.05  61     5.57   92.5
2026.05  60.8   5.65   83.8
2026.05  60.6  5.776.2
2026.05  60.4   5.12   95.2
2026.05  59.8   5.92   86.8
2026.05  58.6   3.97   61.8
2026.05  58     4.19   91.5
2026.05  57.4   3.63   69.4
2026.05  56.6   5.25  100.4
2026.05  55.1   3.44  100
2026.05  54.7   2.75   81.2
2026.05  54.4   6.13  100
2026.05  50.2  10.73   49.1
2026.05  49.2  5.337.3
2026.05  48.7   0.03   86.8
2026.05  46.4   3.02  100
2026.05  45.4   3.06  100
2026.05  45     -     100
2026.05  45     -     100
2026.05  44.9   0       2
2026.05  44.4   0       0.7
2026.05  42.2   -     100
2026.05  42     0       5
2026.05  40.4   3.25   17.4
2026.05  31.5   4.47   20.8
2026.05  29.5   0.33  100
2026.05  20.5  2.256.7