Security-aware code generation on LLMSecEval+
[Chart: Functional Success @1 by date. Highlighted entry: Reprompt, 40 (May 16, 2026). Selectable metrics: Functional Success @1, Security Success @1, Functional Success Rate @1, Functional Success Rate @5, Functional Success Rate @10.]
Updated 15d ago
Evaluation Results
| Method | Model | Date | Functional Success @1 | Security Success @1 | Functional Success Rate @1 | Functional Success Rate @5 | Functional Success Rate @10 |
|---|---|---|---|---|---|---|---|
| Reprompt | AR | 2026.05 | 40 | 64.7 | 26 | 40 | 48.7 |
| Vanilla | AR | 2026.05 | 37.3 | 55.3 | 21.3 | 30.7 | 38 |
| Sec prompt | Diffusion | 2026.05 | 34.7 | 58.7 | 16 | 22.7 | 26.7 |
| Vanilla | Diffusion | 2026.05 | 32 | 54.7 | 14.7 | 22 | 24 |
| CDC | Diffusion | 2026.05 | 30.7 | 80.7 | 24.7 | 40.7 | 45.3 |
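The results listed above can be compared programmatically. The following is a minimal sketch, not part of the benchmark itself: the `Result` type, its field names, and the `rank_by` and `best_by` helpers are hypothetical names introduced here for illustration, and the numbers are copied from the listing as-is. Because "Vanilla" appears for both model types, entries are identified by the (method, model) pair.

```python
from typing import NamedTuple


class Result(NamedTuple):
    # Field names are illustrative assumptions, not official metric identifiers.
    method: str
    model: str
    func_success_1: float   # Functional Success @1
    sec_success_1: float    # Security Success @1
    func_rate_1: float      # Functional Success Rate @1
    func_rate_5: float      # Functional Success Rate @5
    func_rate_10: float     # Functional Success Rate @10


# Values transcribed from the Evaluation Results listing (all dated 2026.05).
RESULTS = [
    Result("Reprompt", "AR", 40.0, 64.7, 26.0, 40.0, 48.7),
    Result("Vanilla", "AR", 37.3, 55.3, 21.3, 30.7, 38.0),
    Result("Sec prompt", "Diffusion", 34.7, 58.7, 16.0, 22.7, 26.7),
    Result("Vanilla", "Diffusion", 32.0, 54.7, 14.7, 22.0, 24.0),
    Result("CDC", "Diffusion", 30.7, 80.7, 24.7, 40.7, 45.3),
]


def rank_by(metric: str) -> list[Result]:
    """Return all entries sorted from highest to lowest on the given field."""
    return sorted(RESULTS, key=lambda r: getattr(r, metric), reverse=True)


def best_by(metric: str) -> tuple[str, str]:
    """Return the (method, model) pair with the highest value for the field."""
    top = rank_by(metric)[0]
    return (top.method, top.model)


if __name__ == "__main__":
    for field in Result._fields[2:]:
        print(field, "->", best_by(field))
```

Ranking on a single column hides the trade-off visible in the listing: the entry with the highest Functional Success @1 is not the one with the highest Security Success @1, so it is worth ranking on each metric separately.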