Informal to Formal Translation on Instruction Induction (test)
Top result: PE2, Mean Score 61.26 (Nov 9, 2023)
[Chart: Mean Score and Std Dev by date]
Evaluation Results
Method      Date      Mean Score   Std Dev
PE2         2023.11   61.26        4.73
APE         2023.11   59.53        3.37
APO         2023.11   54.1         10.61
Iter. APE   2023.11   48.83        5.83