Commonsense Question Answering on WinoGrande (WG) (val)

Best result: 78.3% accuracy, DeBERTa-v3-L (CANDLE Distilled)

Updated 4mo ago

Evaluation Results

Method	Date	Accuracy (%)
DeBERTa-v3-L (CANDLE Distilled)	2024.01	78.3
CAR-DeBERTa-v3-L	2024.01	78.2
CAR-DeBERTa-v3-L	2024.01	78.1
GPT-4	2024.01	77
DeBERTa-v3-L (MR)	2024.01	76
Mistral-v0.1	2024.01	75.3
LLAMA2	2024.01	72.8
DeBERTa-v3-L (MR)	2024.01	71.7
VERA-T5-xxl (CANDLE Distilled)	2024.01	71.3
LLAMA2	2024.01	69.2
VERA-T5-xxl	2024.01	68.1
VERA-T5-xxl	2024.01	67.5
VERA-T5-xxl	2024.01	67.2
ChatGPT + Self-consistent chain-of-thought	2024.01	64.1
ChatGPT + Chain-of-thought	2024.01	63.6
ChatGPT	2024.01	62.8
GPT 3.5	2024.01	60.7
STL-Adapter	2024.01	60.3
RoBERTa-L	2024.01	57.5
Self-talk	2024.01	54.7
DeBERTa-v3-L	2024.01	50.3