Multi-task Language Understanding on MMLU Pro (test)

Score: 65.1

[Figure: score history chart; y-axis: Score; x-axis: date (Apr 13, 2026)]
Updated 4d ago

Evaluation Results

Method  Links

2026.04   65.1   32.1   84.7   47.9   69.8   71.1   67     62.5
2026.04   64.2   32.3   83.1   44.2   64.7   72.4   64.9   60.8
2026.04   63.6   33.3   83.8   47.7   69.3   71.5   68.8   62.6
2026.04   62.8   32.1   85.2   45.1   67.5   73.6   66.9   61.9
2026.04   61.1   35.2   83.4   49.1   69.8   71     65.4   62.1
2026.04   51.1   31.3   70.6   41     58.7   64.2   53.9   53
2026.04   49.4   31.1   70.7   41     59.4   61.6   54.4   52.5
2026.04   48.3   32.4   71.5   41.5   59.2   63.9   55.4   53.2
2026.04   47.4   31.8   70     39.5   58.7   61.8   54.4   51.9
2026.04   46.6   31     70.3   36.3   53.6   61.3   52.2   50.2
2026.04   36.9   18.5   59.9   35.5   41.2   54.2   35.6   40.3
2026.04   35.3   20.3   60.8   35.8   40.2   54.2   36.5   40.4
2026.04   34.1   20.5   60.6   35.5   41.7   54.2   36.8   40.5
2026.04   32.4   18.3   55     35.7   36     51.9   37.5   38.1
2026.04   32.4   19.6   60.8   33     41.9   52.8   37.9   39.8
2026.04   19.3   16.5   38.7   16.9   20.7   38.8   27.2   25.4
2026.04   18.8   19.4   44.7   19.8   21.2   39.8   26.4   27.2
2026.04   17.6   18.3   45.2   20.9   22     41.1   26.1   27.3
2026.04   16.2   8.5    29.1   14.5   16.1   21.7   18     17.7
2026.04   16.1   17     40.6   20.9   18     40.3   26.7   25.7