Math Word Problem solving on ASDiv-A (5-fold cross-val)
Top result: MSAT-DEDUCTREASONER, 87.5 Accuracy
[Chart: Accuracy and Performance Change (Delta) over time; x-axis tick at Jun 2, 2023]
Updated 4d ago
Evaluation Results
Method                                               Configuration              Links (date)  Accuracy  Performance Change (Delta)
MSAT-DEDUCTREASONER                                  Input Configuration=di...  2023.06       87.5      2.5
DEDUCTREASONER                                       Input Configuration=sy...  2023.06       85        -
DEDUCTREASONER                                       Input Configuration=di...  2023.06       84.1      -0.9
MSAT-ROBERTAGEN                                      Input Configuration=di...  2023.06       81.8      9.7
Large language models w/ Chain-of-Thought prompting  Backbone=code-davinci-...  2023.06       80.4      -
ROBERTAGEN                                           Input Configuration=sy...  2023.06       72.1      -
ROBERTAGEN                                           Input Configuration=di...  2023.06       71.9      -0.2