Zero-shot Question Answering on Commonsense Reasoning Suite (PIQA, WinoGrande, HellaSwag, ARC)

82.7 PIQA Accuracy (Zero-shot)

Dense

[Chart: PIQA Accuracy (Zero-shot); axis ticks 78.6544, 79.7047, 80.755, 81.8053; May 23, 2025]
Updated 3d ago

Evaluation Results

Date      PIQA    Task 2  Task 3  Task 4  Task 5  Average
2025.05   82.7    77.98   83.84   80.98   57.34   76.57
2025.05   81.45   76.14   81.03   77.97   54.63   74.24
2025.05   81.35   76.73   81.26   77.69   54.56   74.32
2025.05   80.33   74.26   76.93   74.85   48.47   70.97
2025.05   80.15   75.62   79.85   76.41   50.29   72.46
2025.05   79.52   73.15   76.37   75.15   47.89   70.42
2025.05   78.81   70.25   74.89   73.56   46.45   68.79
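
The sixth value in each row above appears to be the arithmetic mean of the five scores before it. The page does not say so; this is an inference from the numbers. A minimal Python sketch that checks it, assuming two-decimal rounding (the names `rows` and `mean_matches` are illustrative, not from the source):

```python
# Each tuple: five task scores followed by the sixth value from the table.
# Assumption (not stated on the page): the sixth value is the mean of the
# first five, rounded to two decimals.
rows = [
    (82.7, 77.98, 83.84, 80.98, 57.34, 76.57),
    (81.45, 76.14, 81.03, 77.97, 54.63, 74.24),
    (81.35, 76.73, 81.26, 77.69, 54.56, 74.32),
    (80.33, 74.26, 76.93, 74.85, 48.47, 70.97),
    (80.15, 75.62, 79.85, 76.41, 50.29, 72.46),
    (79.52, 73.15, 76.37, 75.15, 47.89, 70.42),
    (78.81, 70.25, 74.89, 73.56, 46.45, 68.79),
]


def mean_matches(row, tol=0.005):
    """Return True if the last value equals the mean of the others within tol."""
    *scores, reported = row
    mean = sum(scores) / len(scores)
    # Small epsilon guards against floating-point noise at the rounding boundary.
    return abs(mean - reported) <= tol + 1e-9


if __name__ == "__main__":
    for row in rows:
        print(row[-1], mean_matches(row))
```

If the check holds for every row, the last column can be treated as an unweighted average across the five tasks rather than a separate benchmark.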