Zero-shot Commonsense Reasoning on BoolQ, PIQA, HellaSwag, and Winogrande

84.9 Avg Commonsense Accuracy

U-PaLM

[Chart: Avg Commonsense Accuracy over time, Oct 20, 2022 to Dec 24, 2024]
Updated 4d ago

Evaluation Results

Date     Avg    Task scores
2022.10  84.9   88.8   84.1   84.1   82.6   -      -      -
2022.10  83.7   88     82.3   83.4   81.1   -      -      -
2022.10  80.7   85.4   81.4   79.7   76.2   -      -      -
2022.10  80.5   84.8   80.5   79.7   77     -      -      -
2022.10  80.3   83.7   81.8   80.8   74.9   -      -      -
2022.10  78.2   81.8   81.8   79.7   70.1   -      -      -
2024.12  72.56  -      -      -      -      -      -      -
2024.12  71.92  -      -      -      -      -      -      -
2024.12  71.13  -      -      -      -      -      -      -
2024.12  69.99  -      -      -      -      -      -      -
2024.12  68.16  -      -      -      -      -      -      -
2024.12  68.06  -      -      -      -      -      -      -
2024.12  66.79  -      -      -      -      -      -      -
2024.12  66.37  -      -      -      -      -      -      -
2024.12  65.85  78.41  78.56  74.68  70.09  43.4   43.77  72.01
2024.12  65.68  -      -      -      -      -      -      -
2024.12  65.5   -      -      -      -      -      -      -
2024.12  63.91  75.41  77.09  72.34  68.43  43.4   41.47  69.23
2024.12  63.55  74.98  77.42  72.19  67.88  42.6   41.47  68.31
2024.12  60.99  70.09  76.93  70.06  64.09  38.2   40.53  67.05
2024.12  59.49  -      -      -      -      -      -      -
2024.12  59.47  -      -      -      -      -      -      -
2024.12  58.8   66.21  75.03  66.74  63.85  38     38.91  62.84
2024.12  58.63  61.96  76.88  69.18  63.3   39.4   37.88  61.83
2024.12  58.2   69.17  75.03  65.25  61.25  36.6   35.15  64.94
2024.12  57.82  -      -      -      -      -      -      -
2024.12  56.38  62.87  74.48  63.03  60.93  36.2   36.86  60.31
2024.12  53.34  61.04  71.33  58.87  57.85  36.6   32.08  55.64
2024.12  53.22  -      -      -      -      -      -      -
2024.12  52.81  65.84  71.22  54.08  56.83  35.6   31.4   54.71
2024.05  46.84  61.38  67.9   44.74  51.7   31     24.83  46.38
2024.05  46.65  60.46  68.93  44.58  50.99  30.2   25     46.38
2024.05  45.45  61.38  66.49  42.22  49.64  30.6   24.74  43.1
2024.12  44.76  40.76  67.08  46.64  53.28  34     27.56  43.98