
Commonsense Reasoning on PIQA (test)

90.1 Accuracy

UNICORN

[Chart: accuracy over time, Oct 15, 2021 to Feb 3, 2026]
Updated 3d ago

Evaluation Results

Date      Accuracy
2022.01   90.1
2022.01   83.19
2022.01   82.3
2022.01   81.99
2022.01   81.8
2023.01   81.07
2022.01   81
2022.01   80.96
2023.01   80.63
2022.01   80.5
2023.01   79.54
2023.01   79.54
2021.10   79.27
-         78.94
2023.04   76
2023.04   75.2
2023.04   73.9
2026.02   73.4
2026.02   73.4
2026.02   72.5
-         71.49
2023.04   71.1
2025.10   71
2021.10   70.84
2023.04   70.7
2025.10   69
2025.10   68
2025.10   67
2023.04   66.8
2021.10   66.32
2026.02   66.1
2026.02   63.7
-         63.44
2026.02   63.1
2021.10   62.89
2023.04   62.7
2021.10   60.45
2025.05   59.85
2023.04   59.5
2025.05   59.19
2025.05   59.03
2025.05   58.92
2025.05   58.27
2025.05   57.73
2021.10   57.45
2023.01   54.73