| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| CWEval | | pass@1 42.17 | 24 | 1mo ago | |
| CWEval | | Functionality 92.27 | 22 | 1mo ago | |
| CodeGuard+ | Hybrid (CodeGuard + SCS) | Pass@1 85.93 | 18 | 1mo ago | |
| CyberSecEval SCG | SafeCoder | Safety 79.06 | 17 | 1mo ago | |
| Secure Code Average | SecCoderX | Safety Score 55.36 | 12 | 1mo ago | |
| SecHolmesEval | P10 Hybrid Pipeline | Insecure Generation Rate 1.9 | 8 | 25d ago | |
| SecLLMEval | | Insecure Generation Rate 2.7 | 8 | 25d ago | |
| Secure Code Generation Scenarios 1.0 (test) | gemini-2.5-pro (Reflex) | Security Success Rate 0.971 | 8 | 1mo ago | |
| Secure Code generation | BEAVER | RDR 42 | 8 | 1mo ago | |
| CVS (test) | Llama3-70b-instruct | C++ Success Rate 98 | 8 | 1mo ago | |
| COBALT Security Prompts 500 prompts per model | | Vulnerability Rate 48.4 | 7 | 11d ago | |