Multitask Language Understanding on MMLU (Accuracy and AVERAGE MEAN)

Accuracy: 50.12

Full-data Fine-tuning

[Chart: Accuracy over time; y-axis ticks 45.076, 46.3855, 47.695, 49.0045; x-axis Oct 8, 2025]
Updated 19d ago

Evaluation Results

Date      Accuracy   AVERAGE MEAN
2025.10   50.12      48.99
2025.10   49.33      48.56
2025.10   49.23      48.27
2025.10   48.12      46.6
2025.10   46.7       45.56
2025.10   46.13      42.61
2025.10   46.12      45.41
2025.10   45.84      45.14
2025.10   45.73      42.02
2025.10   45.6       43.43
2025.10   45.27      42.17