Codebase generation on Online Social Media App (test)
[Chart: Feature Completion (%) by date. Top result: Code-L2MAC, 82.4, Oct 2, 2023. Other available metrics: Error Count, LOC, Test Success Rate.]
Updated 1mo ago
Evaluation Results
Method                     Date     Feature Completion (%)  Error Count  LOC   Test Success Rate
Code-L2MAC                 2023.10  82.4                    0            395   1,830
AutoGPT (Backbone=GPT-4)   2023.10  33.3                    0.6          148   300
CodeT                      2023.10  19.5                    16.4         110   181
Reflexion                  2023.10  19.5                    10.2         10.5  0
Self-Refine                2023.10  15.2                    2.53         122   133
GPT4                       2023.10  13.5                    4.09         116   81.8