
Zero-shot Reasoning on HellaSwag

76.3 Accuracy

Llama2-7B

[Chart: accuracy (y-axis) by date (x-axis), Jun 15, 2024 to Feb 16, 2026]
Updated 4d ago

Evaluation Results

Date      Accuracy
2026.02   76.3
2024.06   60.11
2024.06   57.17
2024.09   57.03
2024.09   55.92
2024.09   55.09
2026.02   51.8
2026.02   51.5
2024.09   50.91
2024.06   50.34
2024.06   48.91
2024.06   47.05
2024.06   46.62
2024.06   43.64
2024.09   42.64
2024.06   42.44
2024.06   40.63
2026.02   39.6
2024.06   37.96
2024.06   35.76
2024.06   35.61
2026.02   35.4
2024.09   34.65
2024.06   32.63
2024.06   30.94
2024.06   29.74
2026.02   28.8
2026.02   28.4
2024.06   26.81