Code Generation on APPS Interview
[Chart: Pass@1 over time, Jul 7, 2021 to Jul 21, 2022; highlighted result: Codex, 2.64]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Pass@1 | Pass@1 (incl. timeouts) | Pass@5 | Pass@5 (incl. timeouts) | Pass@100 | Pass@100 (incl. timeouts) | Pass@1000 | Pass@1000 (incl. timeouts) | Pass@2 | Pass@10 | Pass@50 | PR | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Codex | Backbone=Codex-12B, Fi... | 2021.07 | 2.64 | 5.78 | 3.23 | 7.13 | - | - | - | - | - | - | - | - | - |
| GPT-Neo | Backbone=GPT-Neo 2.7B,... | 2021.07 | 0.57 | - | 0.8 | - | - | - | - | - | - | - | - | - | - |
| Codex | Backbone=Codex-12B, Fi... | 2021.07 | 0.14 | 0.3 | 0.51 | 1.02 | 2.04 | 7.94 | 3.7 | - | - | - | - | - | - |
| CODET | Model=code-davinci-002... | 2022.07 | 0.081 | - | - | - | - | - | - | - | 0.112 | 0.181 | - | - | - |
| Baseline | Model=code-davinci-002... | 2022.07 | 0.051 | - | - | - | - | - | - | - | - | 0.128 | 0.23 | - | - |