Secure Code Generation on COBALT Security Prompts
500 prompts per model
[Chart: Vulnerability Rate. Highlighted point: Gemini 2.5 Flash, 48.4 (Apr 7, 2026). Available metrics: Vulnerability Rate, Critical Vulnerability Count, High Severity Vulnerability Count, Z3 Score, Grade Score.]
Updated 11d ago
Evaluation Results
Method                                      | Date    | Vulnerability Rate | Critical Vulnerability Count | High Severity Vulnerability Count | Z3 Score | Grade Score
Gemini 2.5 Flash (Temperature=0, API Ide...) | 2026.04 | 48.4               | 146                          | 86                                | 142      | -
Claude Haiku 4.5 (Temperature=0, API Ide...) | 2026.04 | 49.2               | 155                          | 81                                | 152      | -
GPT-4.1 (Temperature=0, API Ide...)          | 2026.04 | 54                 | 142                          | 86                                | 136      | -
Mistral Large (Temperature=0, API Ide...)    | 2026.04 | 57.8               | 155                          | 94                                | 155      | -
Llama 3.3 70B (Temperature=0, API Ide...)    | 2026.04 | 58.4               | 168                          | 83                                | 147      | -
Llama 4 Scout (Temperature=0, API Ide...)    | 2026.04 | 60.6               | 167                          | 95                                | 156      | -
GPT-4o (Temperature=0, API Ide...)           | 2026.04 | 62.4               | 166                          | 106                               | 167      | -
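To make the ranking arithmetic concrete, here is a minimal sketch that orders the models by Vulnerability Rate and converts each rate into an implied number of vulnerable prompts. It assumes the rate is a percentage of the 500 prompts per model stated in the title; the page does not confirm the unit, so treat the derived counts as illustrative. The names `RESULTS`, `PROMPTS_PER_MODEL`, and `implied_vulnerable_prompts` are hypothetical and exist only for this example.

```python
# Vulnerability Rate per model, copied from the results listing.
RESULTS = [
    ("Gemini 2.5 Flash", 48.4),
    ("Claude Haiku 4.5", 49.2),
    ("GPT-4.1", 54.0),
    ("Mistral Large", 57.8),
    ("Llama 3.3 70B", 58.4),
    ("Llama 4 Scout", 60.6),
    ("GPT-4o", 62.4),
]

PROMPTS_PER_MODEL = 500  # from the benchmark title


def implied_vulnerable_prompts(rate, prompts=PROMPTS_PER_MODEL):
    """Convert a rate into a prompt count.

    ASSUMPTION: the rate is a percentage of prompts that produced
    vulnerable code. The source page does not state the unit.
    """
    return round(rate / 100 * prompts)


# Lower Vulnerability Rate is better, so sort ascending.
ranked = sorted(RESULTS, key=lambda row: row[1])

for position, (model, rate) in enumerate(ranked, start=1):
    count = implied_vulnerable_prompts(rate)
    print(f"{position}. {model}: {rate} -> about {count} of {PROMPTS_PER_MODEL} prompts")
```

Under that assumption, the spread between the best and worst listed models (48.4 versus 62.4) corresponds to roughly 70 prompts out of 500.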