Zero-shot Reasoning on Reasoning Suite (ARC-e, ARC-c, HellaSwag, PIQA, Winogrande)

ARC-e Accuracy: 0.7559

LLaMA-2 (FP16)

[Chart: ARC-e Accuracy; axis ticks 0.524292, 0.584421, 0.64455, 0.704679; date Dec 2, 2025]
Updated 4d ago

Evaluation Results

Date      ARC-e   ARC-c   HellaSwag  PIQA    Winogrande  Average
2025.12   0.7559  0.4317  0.5706     0.7791  0.6985      0.6472
2025.12   0.7273  0.3976  0.5333     0.7617  0.6803      0.62
2025.12   0.6368  0.3276  0.4955     0.7476  0.6567      0.5728
2025.12   0.5846  0.3106  0.4521     0.7149  0.5919      0.5308
2025.12   0.5656  0.2482  0.3819     0.7008  0.5367      0.4866
2025.12   0.5593  0.2415  0.3843     0.698   0.5517      0.487
2025.12   0.5556  0.2884  0.4294     0.7138  0.6243      0.5223
2025.12   0.5332  0.227   0.3557     0.6681  0.5264      0.4621
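
The sixth value in each row appears to be the unweighted mean of the five task accuracies, taken in the order the title lists them (ARC-e, ARC-c, HellaSwag, PIQA, Winogrande). The page does not state this; it is an inference from the numbers. A minimal Python sketch that checks it against the rows above, where `ROWS`, `mean_accuracy`, and `check_rows` are illustrative names:

```python
# Each tuple: five task accuracies followed by the reported sixth value.
# Assumed column order (from the page title): ARC-e, ARC-c, HellaSwag, PIQA, Winogrande.
ROWS = [
    (0.7559, 0.4317, 0.5706, 0.7791, 0.6985, 0.6472),
    (0.7273, 0.3976, 0.5333, 0.7617, 0.6803, 0.62),
    (0.6368, 0.3276, 0.4955, 0.7476, 0.6567, 0.5728),
    (0.5846, 0.3106, 0.4521, 0.7149, 0.5919, 0.5308),
    (0.5656, 0.2482, 0.3819, 0.7008, 0.5367, 0.4866),
    (0.5593, 0.2415, 0.3843, 0.698, 0.5517, 0.487),
    (0.5556, 0.2884, 0.4294, 0.7138, 0.6243, 0.5223),
    (0.5332, 0.227, 0.3557, 0.6681, 0.5264, 0.4621),
]


def mean_accuracy(scores):
    """Unweighted mean of a sequence of per-task accuracies."""
    return sum(scores) / len(scores)


def check_rows(rows, tol=1e-4):
    """For each row, return (computed mean rounded to 4 places, reported value, match flag)."""
    results = []
    for *tasks, reported in rows:
        computed = mean_accuracy(tasks)
        results.append((round(computed, 4), reported, abs(computed - reported) < tol))
    return results


if __name__ == "__main__":
    for computed, reported, ok in check_rows(ROWS):
        print(f"computed={computed:.4f}  reported={reported:.4f}  {'ok' if ok else 'MISMATCH'}")
```

Under this assumption every row agrees with its reported sixth value to four decimal places. The rows are sorted by ARC-e accuracy rather than by this mean, which is why the seventh row has a higher mean than the two rows above it.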