
Instruction Tuning on MMLU, BBH, GSM, TydiQA, HumanEval, and AlpacaEval Suite

Top MMLU score: 55.7 (Alpaca-GPT4), Feb 26, 2024

Evaluation Results

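| Date    | MMLU | BBH  | GSM  | TydiQA | HumanEval | AlpacaEval | Avg  | Δ    |
|---------|------|------|------|--------|-----------|------------|------|------|
| 2024.02 | 55.7 | 46.6 | 30.5 | 48.1   | 40.8      | 46.5       | 44.7 | -    |
| 2024.02 | 55.7 | 48.9 | 33   | 54.1   | 42.2      | 48.8       | 47.1 | 2.4  |
| 2024.02 | 55.3 | 48.5 | 32   | 50.8   | 41.3      | 47.3       | 45.9 | 1.2  |
| 2024.02 | 55.3 | 47.3 | 30.5 | 51.3   | 39.8      | 46.2       | 45.1 | 0.4  |
| 2024.02 | 55.2 | 48.3 | 31   | 52.2   | 42.5      | 46.3       | 45.9 | 1.2  |
| 2024.02 | 55.1 | 47.5 | 31.5 | 52.3   | 40.2      | 46.1       | 45.5 | 0.8  |
| 2024.02 | 54.6 | 45.3 | 30.5 | 51.1   | 34.1      | 42.6       | 43   | -1.7 |
| 2024.02 | 54.1 | 47.3 | 31.5 | 50.6   | 41.3      | 46.3       | 45.2 | 0.5  |
| 2024.02 | 54.1 | 47.3 | 32.5 | 52.6   | 43.3      | 48.3       | 46.3 | 1.6  |
| 2024.02 | 47.4 | 40.6 | 16.8 | 47.4   | 29.4      | 35.7       | 36.2 | 2.2  |
| 2024.02 | 47.3 | 37.4 | 16.1 | 45.3   | 28.4      | 35.8       | 35.1 | 1    |
| 2024.02 | 47   | 39.6 | 16.5 | 47.1   | 28.6      | 34.4       | 35.5 | 1.5  |
| 2024.02 | 46.9 | 39.4 | 15.3 | 46.7   | 28.2      | 35.7       | 35.4 | 1.3  |
| 2024.02 | 46.9 | 38.1 | 16.1 | 48.4   | 26.9      | 35.3       | 35.3 | 1.2  |
| 2024.02 | 46.8 | 36.5 | 14.5 | 44.6   | 28.9      | 35.5       | 34.5 | 0.4  |
| 2024.02 | 46.5 | 38.4 | 15   | 43.4   | 26.8      | 34.2       | 34.1 | -    |
| 2024.02 | 45.9 | 39   | 14.5 | 46.4   | 27.5      | 35.4       | 34.8 | 0.7  |
| 2024.02 | 45.4 | 37.5 | 14.3 | 45.1   | 24.6      | 33.1       | 33.3 | -0.7 |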
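
In each result row, the seventh value is consistent with the arithmetic mean of the six benchmark scores before it. The final value is consistent with the change in that mean relative to the baseline row of the same group (the row whose last value is "-"). The sketch below reproduces both figures for the first two rows. The function names are hypothetical, and the benchmark order is an assumption taken from the order in the page title. The delta appears to be computed from unrounded averages, which would explain why some listed deltas differ by 0.1 from a subtraction of the rounded averages.

```python
# Six benchmark scores per row, assumed to follow the title's order:
# MMLU, BBH, GSM, TydiQA, HumanEval, AlpacaEval.
baseline = [55.7, 46.6, 30.5, 48.1, 40.8, 46.5]   # first row, delta shown as "-"
candidate = [55.7, 48.9, 33, 54.1, 42.2, 48.8]    # second row


def average(scores):
    """Unrounded arithmetic mean of the benchmark scores."""
    return sum(scores) / len(scores)


def delta(scores, reference):
    """Difference of the unrounded averages, rounded to one decimal."""
    return round(average(scores) - average(reference), 1)


print(round(average(baseline), 1))   # 44.7
print(round(average(candidate), 1))  # 47.1
print(delta(candidate, baseline))    # 2.4
```

The printed values match the average and delta listed for those two rows.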