Commonsense Question Answering on WinoGrande (WG) (val)
[Chart: Accuracy over time. Top result: 78.3, DeBERTa-v3-L (CANDLE Distilled), Jan 14, 2024]
Updated 4d ago
Evaluation Results
Method                                      Settings                   Date     Accuracy
DeBERTa-v3-L (CANDLE Distilled)             Zero-shot=true, CSKB=C...  2024.01  78.3
CAR-DeBERTa-v3-L                            Zero-shot=true, CSKB=A...  2024.01  78.2
CAR-DeBERTa-v3-L                            Zero-shot=true, CSKB=A...  2024.01  78.1
GPT-4                                       Zero-shot=true, Varian...  2024.01  77
DeBERTa-v3-L (MR)                           Zero-shot=true, CSKB=A...  2024.01  76
Mistral-v0.1                                Zero-shot=true, Model...   2024.01  75.3
LLAMA2                                      Zero-shot=true, Model...   2024.01  72.8
DeBERTa-v3-L (MR)                           Zero-shot=true, CSKB=A...  2024.01  71.7
VERA-T5-xxl (CANDLE Distilled)              Zero-shot=true, CSKB=C...  2024.01  71.3
LLAMA2                                      Zero-shot=true, Model...   2024.01  69.2
VERA-T5-xxl                                 Zero-shot=true, CSKB=A...  2024.01  68.1
VERA-T5-xxl                                 Zero-shot=true, CSKB=A...  2024.01  67.5
VERA-T5-xxl                                 Zero-shot=true, CSKB=A...  2024.01  67.2
ChatGPT + Self-consistent chain-of-thought  Zero-shot=true, Varian...  2024.01  64.1
ChatGPT + Chain-of-thought                  Zero-shot=true, Varian...  2024.01  63.6
ChatGPT                                     Zero-shot=true, Varian...  2024.01  62.8
GPT 3.5                                     Zero-shot=true, Varian...  2024.01  60.7
STL-Adapter                                 Zero-shot=true, CSKB=A...  2024.01  60.3
RoBERTa-L                                   Zero-shot=true             2024.01  57.5
Self-talk                                   Zero-shot=true             2024.01  54.7
DeBERTa-v3-L                                Zero-shot=true             2024.01  50.3