
Question Answering and Commonsense Reasoning on LM Eval ARCC, ARCE, HellaSwag, PIQA (test)

61.6 ARCC

Meta 'FP8'

Jun 17, 2024
Updated 1mo ago

Evaluation Results

Date      ARCC   ARCE   HellaSwag   PIQA
2024.06   61.6   81.4   67.1        83.8
2024.06   61.5   81.4   66.8        83.5
2024.06   61.3   80.9   66.7        84.2
2024.06   60.7   81.1   65.4        82.2
2024.06   56.7   75.6   61.5        82.8
2024.06   56.3   75.8   61.4        83
2024.06   55.1   75.1   60.8        82.6
2024.06   54.4   72.6   59.4        82.5
2024.06   52.7   72.2   60.2        82.6
2024.06   51.6   77.8   57.7        80
2024.06   50.7   78     57.5        80.1
2024.06   50.4   77.7   56.9        79.3
2024.06   46.2   75.4   54.4        78.7
2024.06   45.1   75.6   54.5        79
2024.06   43.5   74.3   51.9        75.1
2024.06   43.3   74.3   52.2        75.7
2024.06   36     68.5   45.2        74.2
2024.06   34.8   68.4   44.5        73.3