Zero-shot Commonsense Reasoning on ARC-Easy, ARC-Challenge, SIQA, PIQA, and WinoGrande
[Chart: Reasoning Accuracy over time (data point dated Mar 12, 2024). Top result: 66.1, LLAMA-2.]
Evaluation Results

Method            Setting                     Date     Reasoning Accuracy
LLAMA-2           Model Size=13B              2024.03  66.1
BTX               Active Experts (Top-k)...   2024.03  63.7
BTX               Active Experts (Top-k)=2    2024.03  63.5
LLAMA-2           Model Size=7B               2024.03  63.3
Dense             Training Context=Data-...   2024.03  63.3
Sparse upcycling  Training Context=Data-...   2024.03  62.3
BTM               Active Experts (Top-k)=2    2024.03  61.2
BTM               Active Experts (Top-k)=1    2024.03  61
CODELLAMA         Model Size=7B               2024.03  56.6
LLEMMA            Model Size=7B               2024.03  38.8