Commonsense Reasoning and Question Answering on ARC-e, ARC-c, Winogrande, BoolQ, PIQA, HellaSwag, OBQA, HQA

60.5 ARC-e Acc

Pythia

[Chart: ARC-e Acc; y-axis ticks 31.172, 38.786, 46.4, 54.014; x-axis label Apr 14, 2026]
Updated 19d ago

Evaluation Results

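| Method | Date | ARC-e | ARC-c | Winogrande | BoolQ | PIQA | HellaSwag | OBQA | HQA | Avg | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2026.04 | 60.5 | 31.2 | 61.3 | 61.1 | 71.1 | 53.6 | 33.2 | 31.9 | 50.49 | 1,237.4 | - | 632.48 |
| | 2026.04 | 57.8 | 30.4 | 60.4 | 60.8 | 71.7 | 52.6 | 33.4 | 30.7 | 49.73 | 1,237.1 | - | 632.22 |
| | 2026.04 | 48.7 | 24.8 | 56.8 | 58.1 | 66.7 | 41.6 | 26.1 | 26.2 | 43.63 | 360.9 | - | 197.71 |
| | 2026.04 | 47.5 | 22.2 | 55.3 | 57.2 | 66.1 | 40.7 | 25.7 | 26.6 | 42.68 | 360.8 | - | 197.57 |
| | 2026.04 | 46.3 | 24.3 | 55.6 | 56.8 | 63.4 | 41.7 | 25.2 | 24.3 | 42.19 | 131.9 | 0.184 | 66.89 |
| | 2026.04 | 45.8 | 24.5 | 55.6 | 56.1 | 63.4 | 41.3 | 25.5 | 24.8 | 42.12 | 141.6 | 0.411 | 141.21 |
| | 2026.04 | 45.7 | 23.5 | 54.2 | 56.3 | 62.3 | 40.2 | 24.5 | 24 | 41.33 | 66.7 | 0.192 | 37.16 |
| | 2026.04 | 44.5 | 23.7 | 54.2 | 55.3 | 62.3 | 40.4 | 24.6 | 23.6 | 41.08 | 74.2 | 0.426 | 74.32 |
| | 2026.04 | 43.7 | 19.8 | 52.8 | 55.1 | 62.7 | 33.6 | 20.1 | 24.2 | 39 | 125.7 | - | 126.01 |
| | 2026.04 | 43.6 | 19.3 | 52.3 | 54.6 | 62.4 | 32.1 | 20.2 | 23.7 | 38.6 | 125.6 | - | 125.95 |
| | 2026.04 | 42.1 | 21.4 | 52.1 | 56.1 | 60.5 | 33.1 | 21.9 | 23.5 | 38.84 | 84.2 | 0.178 | 16.75 |
| | 2026.04 | 41.8 | 21.9 | 52.8 | 55.7 | 60.4 | 34 | 21.3 | 23.1 | 38.87 | 88.3 | 0.377 | 35.35 |
| | 2026.04 | 41.5 | 21.7 | 52.3 | 55.1 | 59.7 | 32.7 | 21.2 | 23.8 | 38.48 | 43.4 | 0.182 | 9.31 |
| | 2026.04 | 41.5 | 21.7 | 52.4 | 54.9 | 59.1 | 31.6 | 20.8 | 23.1 | 38.14 | 47.7 | 0.404 | 18.61 |
| | 2026.04 | 39.4 | 19 | 51.2 | 53 | 57.5 | 29.2 | 19.7 | 23.1 | 36.5 | 23.1 | 0.173 | 9.43 |
| | 2026.04 | 39.1 | 18.9 | 50.3 | 52.7 | 56.7 | 28.1 | 19.8 | 22.9 | 36.05 | 12.1 | 0.196 | 5.24 |
| | 2026.04 | 38.9 | 18.5 | 51.5 | 52.9 | 58 | 28.3 | 19.2 | 22.9 | 36.27 | 25.8 | 0.386 | 19.92 |
| | 2026.04 | 38.5 | 18.3 | 51.3 | 52.3 | 57.7 | 29.1 | 19.2 | 22.5 | 36.11 | 13.7 | 0.412 | 10.74 |
| | 2026.04 | 35.2 | 17.7 | 50.7 | 47.3 | 55.1 | 27.6 | 17.3 | 23.1 | 34.25 | 18.3 | 0.168 | 16.53 |
| | 2026.04 | 32.3 | 16.2 | 50.2 | 45.7 | 54.6 | 25.3 | 15.7 | 20.6 | 32.58 | 3.66 | 0.174 | 3.29 |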