Codebase generation on Event Planner App
[Chart: Feature Completion (%). Leading result: Code-L2MAC, 62 (Oct 2, 2023). Metrics: Feature Completion (%), Error Rate, LOC (Lines of Code), Tests Passed Count, Code Coverage (%).]
Updated 1mo ago
Evaluation Results
| Method | Date | Feature Completion (%) | Error Rate | LOC (Lines of Code) | Tests Passed Count | Code Coverage (%) |
|---|---|---|---|---|---|---|
| Code-L2MAC | 2023.10 | 62 | 0 | 473 | 25.6 | 97.1 |
| CodeT | 2023.10 | 26.2 | 0.05 | 75.2 | 2.45 | 92.5 |
| Reflexion | 2023.10 | 23.6 | 0 | 82 | 3 | 95.1 |
| AutoGPT | 2023.10 | 22.5 | 0 | 23.9 | 0 | 32.9 |
| Self-Refine | 2023.10 | 21.4 | 0.15 | 118 | 3.9 | 76.7 |
| GPT4 | 2023.10 | 11.2 | 0.025 | 74.6 | 1.75 | 88.7 |