Common Sense Reasoning on PIQA (dev)

83.2 Accuracy

Megatron-NLG

[Chart: accuracy on PIQA (dev) over time, Dec 13, 2021 to Mar 31, 2026]
Updated 17d ago

Evaluation Results

Date     Accuracy
2021.12  83.2
2021.12  82.3
2021.12  81.8
2021.12  81.8
2021.12  81.4
2021.12  81
2021.12  80.5
2021.12  80.4
2023.06  76.07
2023.06  75.11
2026.03  74.37
2026.03  74.37
2026.03  74.1
2026.03  74.1
2026.03  73.78
2026.03  73.12
2026.03  72.91
2026.03  72.91
2023.06  72.19