Codebase generation on Financial Tracking App
[Chart: Feature Completeness by date. Code-L2MAC scores 62 (Oct 2, 2023). Selectable metrics: Feature Completeness, Error Rate, Lines of Code (LOC), Tests Passed Count, Code Coverage.]
Updated 1mo ago
Evaluation Results
Method       Date     Feature Completeness  Error Rate  Lines of Code (LOC)  Tests Passed Count  Code Coverage
Code-L2MAC   2023.10  62                    0           307                  12                  90.5
GPT4         2023.10  26.2                  0.0513      80.5                 2.13                93.1
AutoGPT      2023.10  25                    0           0                    0                   0
Self-Refine  2023.10  23.6                  0.25        87.2                 0.55                76.7
Reflexion    2023.10  22.5                  0.2         86.8                 2.7                 92.8
CodeT        2023.10  21.4                  0           65.9                 2.25                97.9