
Language Understanding on Llama-3.1-70B Evaluation Suite (MMLU, WinoGrande, HellaSwag, ARC)

MMLU: 78.58 (FP16)

[Chart: score over time, Dec 19, 2025 to May 27, 2026]
Updated 6d ago

Evaluation Results

Date      MMLU    Task 2   Task 3   Task 4   Task 5   Avg
2025.12   78.58   79.64    85.03    86.66    64.93    78.97
2026.05   73.1    72       75.8     86.2     64.6     74.34
2025.12   73.07   76.16    80.83    82.2     57.17    73.89
2025.12   71.21   76.64    79.43    82.41    56.06    73.15
2025.12   68.15   74.9     75.37    79.42    52.73    70.11
2026.05   62.6    66.1     67.1     81.9     55.5     66.63
2025.12   61.45   59.59    52.83    48.82    28.67    50.27
2026.05   46.1    63.1     60.3     75.6     45.4     58.1
2026.05   32.7    61.2     56.3     74       43.1     53.46
2026.05   27.1    54.4     45.4     66       34.2     45.44
2025.12   26.7    53.59    50.67    25.13    24.83    36.58
2026.05   26.6    60.5     57.5     73.1     41.5     51.84
2025.12   26.35   53.04    49.04    48.95    29.52    41.38
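
The last figure in each row appears to be the unweighted mean of the five scores that precede it. The page does not state this, so treat it as an assumption. A minimal sketch that checks it on the first two rows (78.58, 79.64, 85.03, 86.66, 64.93 against 78.97, and 73.1, 72, 75.8, 86.2, 64.6 against 74.34); the names `row_average` and `rows` are hypothetical and used only for illustration:

```python
from statistics import mean

def row_average(task_scores):
    """Unweighted mean of the task scores, rounded to two decimals.

    Assumption: the leaderboard's final column is a plain mean of the
    five per-task scores; the page itself does not document this.
    """
    return round(mean(task_scores), 2)

# (five task scores, reported final-column value) copied from the results
rows = [
    ([78.58, 79.64, 85.03, 86.66, 64.93], 78.97),
    ([73.1, 72, 75.8, 86.2, 64.6], 74.34),
]

for task_scores, reported in rows:
    computed = row_average(task_scores)
    # A small tolerance absorbs rounding in the published figures.
    matches = abs(computed - reported) < 0.01
    print(computed, reported, matches)
```

Rows whose scores are published to one decimal place can differ from the recomputed mean in the second decimal, so a looser tolerance may be needed when checking the full table.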