| Dataset Name | SOTA Method | Metric | Trend | Last Updated |
|---|---|---|---|---|
| 20 COM objects | | C01100 | 12 | 27d ago |
| TerminalBench 2 snapshot 2026-04-17 | AgentFlow | Score (%): 84.3 | 11 | 1mo ago |
| 847 historical Linux kernel CVEs Replay Mode 2019-2025 | VCAO (full) | Time To First Vulnerability (hrs): 3.2 | 7 | 1mo ago |
| Total 10 Projects Aggregated | Sailor | Confirmed Vulnerabilities: 379 | 7 | 1mo ago |
| mupdf | Sailor | Confirmed Vulnerabilities: 141 | 7 | 1mo ago |
| SQLite | | Confirmed Vulnerabilities: 1 | 7 | 1mo ago |
| SELinux | Sailor | Confirmed Vulnerabilities: 62 | 7 | 1mo ago |
| FFmpeg | Sailor | Confirmed Vulnerabilities: 78 | 7 | 1mo ago |
| OpenSSL | | Confirmed Vulnerabilities: 5 | 7 | 1mo ago |
| curl | | Confirmed Vulnerabilities: 0 | 7 | 1mo ago |
| binutils | Sailor | Confirmed Vulnerabilities: 52 | 7 | 1mo ago |
| libpng | Sailor | Confirmed Vulnerabilities: 21 | 7 | 1mo ago |
| libtiff | Sailor | Confirmed Vulnerabilities: 14 | 7 | 1mo ago |
| libxml2 | | Confirmed Vulnerabilities: 20 | 7 | 1mo ago |
| Meta-prompt-guided adversarial generation corpus GPT-OSS 20B v1.0 (dynamically generated) (Evaluation corpus) | Our Framework | RH10 | 5 | 3mo ago |