Commonsense Reasoning on 7 Reasoning Datasets (test)

Overall Average Accuracy: 60

Dense


Evaluation Results

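Method  Links  Date     Overall
-       -      2024.12  60       34  82  74  59  52  80  39  100
-       -      2024.12  60       -   -   -   -   -   -   -   -
-       -      2024.12  59       -   -   -   -   -   -   -   -
-       -      2024.12  53       -   -   -   -   -   -   -   -
-       -      2024.12  53       -   -   -   -   -   -   -   -
-       -      2024.12  52       -   -   -   -   -   -   -   -
-       -      2024.12  51       -   -   -   -   -   -   -   -
-       -      2024.12  48       -   -   -   -   -   -   -   -
-       -      2024.12  47       22  60  70  44  37  69  28  78.6
-       -      2024.12  47       -   -   -   -   -   -   -   -
-       -      2024.12  47       -   -   -   -   -   -   -   -
-       -      2024.12  47       -   -   -   -   -   -   -   -
-       -      2024.12  46       -   -   -   -   -   -   -   -
-       -      2024.12  45       -   -   -   -   -   -   -   -
-       -      2024.12  44       21  57  66  42  32  67  26  74.1
-       -      2024.12  44       -   -   -   -   -   -   -   -
-       -      2024.12  44       -   -   -   -   -   -   -   -
-       -      2024.12  44       -   -   -   -   -   -   -   -
-       -      2024.12  44       -   -   -   -   -   -   -   -
-       -      2024.12  42       26  49  65  33  30  65  30  70.9
-       -      2024.12  41       -   -   -   -   -   -   -   -
-       -      2024.12  40       -   -   -   -   -   -   -   -
-       -      2024.12  35       15  43  51  30  23  58  22  57.7
-       -      2024.12  32       -   -   -   -   -   -   -   -
-       -      2024.12  32       -   -   -   -   -   -   -   -
-       -      2024.12  31       -   -   -   -   -   -   -   -
-       -      2024.12  30       -   -   -   -   -   -   -   -
-       -      2024.12  29       -   -   -   -   -   -   -   -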