Symbolic Reasoning on Symbolic Longer
[Chart: Accuracy (Clean, Avg) over time. Top result: SCO, 0.187 (Oct 31, 2024).]
Updated 4d ago
Evaluation Results
| Method | Date | Accuracy (Clean, Avg) | Accuracy (Irrelevant, Easy) | Accuracy (Irrelevant, Medium) | Accuracy (Irrelevant, Hard) | Accuracy (Irrelevant, Avg) | Accuracy (Inaccurate, Easy) | Accuracy (Inaccurate, Medium) | Accuracy (Inaccurate, Hard) | Accuracy (Inaccurate, Avg) |
|---|---|---|---|---|---|---|---|---|---|---|
| SCO | 2024.10 | 0.187 | 0.121 | 0.105 | 0.113 | 0.113 | 0.152 | 0.159 | 0.098 | 0.136 |
| CD-CoT | 2024.10 | 0.123 | 0.12 | 0.12 | 0.13 | 0.123 | 0.123 | 0.1 | 0.11 | 0.111 |
| CC | 2024.10 | 0.094 | 0.098 | 0.079 | 0.079 | 0.085 | 0.085 | 0.074 | 0.065 | 0.075 |
| Base | 2024.10 | 0.092 | 0.063 | 0.072 | 0.06 | 0.065 | 0.07 | 0.068 | 0.06 | 0.066 |
| BT | 2024.10 | 0.072 | 0.034 | 0.035 | 0.025 | 0.031 | 0.038 | 0.036 | 0.036 | 0.037 |
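The page does not say how the "Avg" columns are derived. The sketch below checks one plausible reading, assumed here rather than confirmed by the source: that each Avg is the unweighted mean of the Easy, Medium, and Hard scores for the same setting. The scores are copied from the results above; the names `RESULTS`, `mean_of_levels`, and `check_averages` are hypothetical and used only for illustration.

```python
# Scores copied from the leaderboard, per method and setting:
# (easy, medium, hard, reported_avg)
RESULTS = {
    "SCO":    {"Irrelevant": (0.121, 0.105, 0.113, 0.113),
               "Inaccurate": (0.152, 0.159, 0.098, 0.136)},
    "CD-CoT": {"Irrelevant": (0.12, 0.12, 0.13, 0.123),
               "Inaccurate": (0.123, 0.1, 0.11, 0.111)},
    "CC":     {"Irrelevant": (0.098, 0.079, 0.079, 0.085),
               "Inaccurate": (0.085, 0.074, 0.065, 0.075)},
    "Base":   {"Irrelevant": (0.063, 0.072, 0.06, 0.065),
               "Inaccurate": (0.07, 0.068, 0.06, 0.066)},
    "BT":     {"Irrelevant": (0.034, 0.035, 0.025, 0.031),
               "Inaccurate": (0.038, 0.036, 0.036, 0.037)},
}


def mean_of_levels(easy, medium, hard):
    """Unweighted mean of the three difficulty levels (assumed definition)."""
    return (easy + medium + hard) / 3


def check_averages(results, tol=0.001):
    """Return (method, setting, recomputed, reported) for every row whose
    recomputed mean differs from the reported Avg by more than tol.

    tol allows for the three-decimal rounding of the published scores.
    """
    mismatches = []
    for method, by_setting in results.items():
        for setting, (easy, medium, hard, reported) in by_setting.items():
            recomputed = mean_of_levels(easy, medium, hard)
            if abs(recomputed - reported) > tol:
                mismatches.append((method, setting, recomputed, reported))
    return mismatches


if __name__ == "__main__":
    for row in check_averages(RESULTS):
        print("mismatch:", row)
```

Under this assumption, an empty result from `check_averages(RESULTS)` means the reported Avg columns are consistent with a simple mean up to rounding. It does not show that the benchmark authors computed them this way.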