Commonsense reasoning on WinoGrande 1.0 (test)

Accuracy: 0.8137

Mistral-7B + DSIR

[Chart: accuracy over time; y-axis ticks 0.597068, 0.653309, 0.70955, 0.765791; x-axis tick Feb 12, 2024]
Updated 4d ago

Evaluation Results

Date      Accuracy
2024.02   0.8137
2024.02   0.8122
2024.02   0.8019
2024.02   0.8011
2024.02   0.8003
2024.02   0.7585
2024.02   0.7537
2024.02   0.753
2024.02   0.7466
2024.02   0.7451
2024.02   0.6661
2024.02   0.6661
2024.02   0.6638
2024.02   0.6638
2024.02   0.6054