
Cybersecurity vulnerability remediation on CVE-Bench 1.0

Pass@1 Rate: 90

GPT-5.3-Codex system card

May 25, 2026
Updated 8d ago

Evaluation Results

Method  Links
2026.05  90  -