Black-box Vulnerability Detection on Mercury
[Chart: Vulnerabilities Found Count. Top result: P3: Methodology-Guided, 22 (May 22, 2026)]
Updated 9d ago
Evaluation Results
Method                   Settings                    Date      Vulnerabilities Found Count
P3: Methodology-Guided   Paradigm=P3, Tooling=E...   2026.05   22
P4: ARG (Det.-Hybrid)    Paradigm=P4, Tooling=N...   2026.05   18
P2: Tool-Augmented       Paradigm=P2, Tooling=E...   2026.05   4
P1: Direct Prompting     Paradigm=P1, Tooling=N...   2026.05   1