Zero-shot Reasoning on Reasoning Tasks (PIQA, ARC-e, HS, WG)

PIQA Accuracy: 80.41

Llama2-13B (W16A16)

Mar 26, 2026
Updated 22d ago

Evaluation Results

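| Method | Date | PIQA | ARC-e | HS | WG | Avg | Additional score (unlabeled) | Links |
|---|---|---|---|---|---|---|---|---|
| | 2026.03 | 80.41 | 77.4 | 79.37 | 72.14 | 77.33 | - | |
| | 2026.03 | 80.41 | 76.73 | 78.3 | 72.14 | 76.9 | - | |
| | 2026.03 | 80.41 | 74.71 | 79.08 | 73.09 | 71.05 | 47.95 | |
| | 2026.03 | 79.96 | 77.27 | 77.96 | 71.98 | 76.79 | - | |
| | 2026.03 | 79.49 | 58.5 | 76.31 | 69.3 | 70.9 | - | |
| | 2026.03 | 79.49 | 58.12 | 74.77 | 69.61 | 70.5 | - | |
| | 2026.03 | 79.43 | 73.15 | 76.16 | 70.24 | 68.81 | 45.05 | |
| | 2026.03 | 79.33 | 73.15 | 76.19 | 70.72 | 69.02 | 45.73 | |
| | 2026.03 | 79.22 | 74.92 | 78.36 | 71.9 | 76.1 | - | |
| | 2026.03 | 79.11 | 74.16 | 76.78 | 69.76 | 74.95 | - | |
| | 2026.03 | 78.73 | 73.27 | 58.22 | 68.9 | 64.37 | 42.75 | |
| | 2026.03 | 78.13 | 65.87 | 55.27 | 67.48 | 60.96 | 38.05 | |
| | 2026.03 | 77.42 | 69.7 | 71.74 | 67.96 | 65.44 | 40.36 | |
| | 2026.03 | 77.21 | 62.58 | 71.05 | 67.51 | 69.59 | - | |
| | 2026.03 | 76.77 | 55.26 | 70.83 | 67.09 | 67.49 | - | |
| | 2026.03 | 76.13 | 67.51 | 69.25 | 66.12 | 69.75 | - | |
| | 2026.03 | 75.84 | 54.29 | 68.32 | 65.51 | 65.99 | - | |
| | 2026.03 | 73.56 | 67.47 | 64.75 | 63.22 | 67.25 | - | |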
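
The fifth score in each row appears to be the unweighted mean of that row's other scores: the first four when the row ends in "-", or those four plus the trailing value when one is present. The page does not state this; it is inferred from the numbers. A minimal Python sketch checking it on two rows, where `row_average` is a hypothetical helper name:

```python
from statistics import mean


def row_average(task_scores):
    """Unweighted mean of one row's task scores, rounded to 2 decimals.

    Assumption: the leaderboard average is a plain arithmetic mean over
    whichever task scores are present in the row.
    """
    return round(mean(task_scores), 2)


# First row: four task scores (the row lists 77.33 as its fifth value).
four_task = row_average([80.41, 77.4, 79.37, 72.14])

# Third row: the same four tasks plus the trailing 47.95
# (the row lists 71.05 as its fifth value).
five_task = row_average([80.41, 74.71, 79.08, 73.09, 47.95])

print(four_task, five_task)
```

Both computed values match the figures listed in those rows. This explains why rows with a trailing sixth value show a lower average than rows with similar scores on the first four tasks, so the two groups are not directly comparable on that column.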