Command-to-action mapping on SCAN (length)
Best reported accuracy: 99.7 (Least-to-Most)
[Chart: accuracy over time, May 21, 2022 to Jun 5, 2023]
Updated 3d ago
Evaluation Results

| Method             | Details                   | Date    | Accuracy |
|--------------------|---------------------------|---------|----------|
| Least-to-Most      | model=code-davinci-002    | 2022.05 | 99.7     |
| vNQ¹               | constraint=hard constr... | 2023.06 | 95.7     |
| PModel             | constraint=hard constr... | 2023.06 | 91.72    |
| Least-to-Most      | model=text-davinci-002... | 2022.05 | 76       |
| Least-to-Most      | model=code-davinci-001    | 2022.05 | 60.7     |
| Standard prompting | model=code-davinci-002    | 2022.05 | 16.7     |
| Chain-of-Thought   | model=code-davinci-002    | 2022.05 | 16.2     |
| Standard prompting | model=text-davinci-002... | 2022.05 | 6        |
| Standard prompting | model=code-davinci-001    | 2022.05 | 0.4      |
| Chain-of-Thought   | model=text-davinci-002... | 2022.05 | 0        |
| Chain-of-Thought   | model=code-davinci-001    | 2022.05 | 0        |