
Question Answering and Commonsense Reasoning on LM Eval ARCC, ARCE, HellaSwag, PIQA (test)

61.6 ARCC

Meta 'FP8'

Jun 17, 2024
Updated 4d ago

Evaluation Results

Date      ARCC   ARCE   HellaSwag   PIQA
2024.06   61.6   81.4   67.1        83.8
2024.06   61.5   81.4   66.8        83.5
2024.06   61.3   80.9   66.7        84.2
2024.06   60.7   81.1   65.4        82.2
2024.06   56.7   75.6   61.5        82.8
2024.06   56.3   75.8   61.4        83
2024.06   55.1   75.1   60.8        82.6
2024.06   54.4   72.6   59.4        82.5
2024.06   52.7   72.2   60.2        82.6
2024.06   51.6   77.8   57.7        80
2024.06   50.7   78     57.5        80.1
2024.06   50.4   77.7   56.9        79.3
2024.06   46.2   75.4   54.4        78.7
2024.06   45.1   75.6   54.5        79
2024.06   43.5   74.3   51.9        75.1
2024.06   43.3   74.3   52.2        75.7
2024.06   36     68.5   45.2        74.2
2024.06   34.8   68.4   44.5        73.3