SOTA Zero-shot Reasoning benchmarks and papers with code | Wizwand
Zero-shot Reasoning
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| Reasoning Suite Zero-shot (PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c) (val test) | FP16 | Average Accuracy | 76.55 | 297 | 21d ago |
| Reasoning Suite (ARC-e, ARC-c, HellaSwag, PIQA, Winogrande) zero-shot | RIA+SQ+VC+EBFT | Average Reasoning Score | 6,540 | 107 | 1mo ago |
| 9 Common Sense Reasoning Tasks (WinoGrande, SocialIQA, LAMBADA, MMLU, ARC-Easy, ARC-Challenge, HellaSwag, OpenBookQA, PIQA) Average | FloatingPoint | Accuracy | 72.7 | 65 | 21d ago |
| PIQA | OPTQ | PIQA Zero-shot Accuracy | 80.9 | 62 | 1mo ago |
| Evaluation Suite Zero-shot (OpenbookQA, ARC-e, ARC-c, WinoGrande, HellaSwag, PIQA, MathQA) | Meta-Llama-3-8B Dense | Average Accuracy | 69.99 | 56 | 3mo ago |
| ZeroShot 7 | FP | Accuracy | 71.57 | 56 | 9d ago |
| Reasoning Tasks (BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA) Zero-shot | Vicuna-13B-v1.3 (Original) | BoolQ Accuracy (Zero-shot) | 82.813 | 55 | 2mo ago |
| Zero-Shot Reasoning Tasks (ARC-C, ARC-E, BoolQ, Hella, OBQA, PIQA, SIQA, Wino) | Baseline | ARC-C Accuracy | 65.53 | 54 | 1mo ago |
| WinoGrande | Baseline | Accuracy | 70 | 54 | 1mo ago |
| HellaSwag | Llama2-7B | Accuracy | 76.3 | 53 | 28d ago |
| ARC-Easy zero-shot | WeDLM-8B | Zero-shot Accuracy | 97.43 | 41 | 2mo ago |
| ARC-e, Winogrande, HellaSwag, PIQA | BF16 | Normalized Avg Accuracy | 77.2 | 36 | 3mo ago |
| MathQA | Llama2-7B | Accuracy | 28.4 | 26 | 2mo ago |
| OpenbookQA | Llama2-7B | Accuracy | 44 | 26 | 2mo ago |
| Downstream Tasks (LMB, PIQA, HellaSwag, OPQA, ARC) | FLT | LAMBADA (LMB) Accuracy | 32.75 | 22 | 14d ago |
| benchmark datasets (PIQA, HeSw, ARC-e, ARC-c, OBQA, Race, WSC273, LAMBADA, MMLU) Zero-shot | LLaMA-2-7B-hf | PIQA | 78.07 | 21 | 3mo ago |
| Reasoning Suite PiQA, LAMBDA, ARC, HellaSwag | NSA | PiQA Score | 62.69 | 20 | 2mo ago |
| ARC-e, PIQA, OpenbookQA, Winogrande, HellaSwag, MathQA | Baseline | Average Accuracy | 57 | 19 | 2mo ago |
| Reasoning Tasks (PIQA, ARC-e, HS, WG) Zero-shot | Llama2-13B (W16A16) | PIQA Accuracy | 80.41 | 18 | 2mo ago |
| HellaSwag zero-shot 200 items | HQMQ s192 r6 + Med3x | Accuracy | 68 | 17 | 6d ago |
| Multiple Reasoning Datasets Combined | PLDRv51-SOC-110M-5 | Average Score 0 | 42.62 | 11 | 2mo ago |
| Zero-shot Average | BF16 baseline | Accuracy | 66 | 11 | 3mo ago |
| Reasoning Suite Zero-shot (PIQA, ARC, HS, WG, BoolQ, MMLU) | SliderQuant | PIQA Accuracy | 80.2 | 9 | 2mo ago |
| StoryCloze zero-shot | OPTQ | Accuracy | 79.95 | 8 | 3mo ago |
| Reasoning Benchmarks Zero-shot (ARC-C, ARC-E, HellaSwag, LAMBADA, OpenBookQA, PIQA, WinoGrande) | H-Net 1.3B (ours) | ARC-C Accuracy | 36.9 | 6 | 5d ago |
Showing 25 of 30 rows