
Language Understanding on MMLU (First-Token Accuracy)

79.7 MMLU First-Token Accuracy

Phi-4-14B

[Chart: MMLU First-Token Accuracy; May 21, 2025]
Updated 12d ago

Evaluation Results

Date       MMLU First-Token Accuracy
2025.05    79.7
2025.05    78.8
2025.05    76.4
2025.05    72.1
2025.05    70.2
2025.05    69.1
2025.05    68.9
2025.05    68.4
2025.05    68.3
2025.05    66
2025.05    65.1
2025.05    63.9
2025.05    63.7
2025.05    63.1
2025.05    62.1
2025.05    58.8
2025.05    58.1
2025.05    57.5
2025.05    57
2025.05    55.9
2025.05    52.4
2025.05    48.1
2025.05    45.9
2025.05    33.9