Vulnerability Detection on Vulnerability evaluation dataset 1.0 (test)
[Chart: Opus score. Highlighted entry: D2 Dual-Pass, 97.2 (Feb 18, 2026). Metrics: Opus, Overall DR, DeepSeek, GPT-5.2, Perplexity, Recovery.]
Updated 4d ago
Evaluation Results
Method              Details                    Date     Opus  Overall DR  DeepSeek  GPT-5.2  Perplexity  Recovery
D2 Dual-Pass        Defense=Dual-Pass Anal...  2026.02  97.2  96.1        94.5      97.2     95.5        35
D5 SAST Cross-Ref   Defense=SAST Cross-Ref...  2026.02  97.2  96.9        96.6      97.2     96.6        47
D1 Comment Strip    Defense=Comment Stripping  2026.02  96.9  93.2        89        95.9     91          30
D6 Comment Anomaly  Defense=Comment Anomal...  2026.02  96.9  95.1        92.8      96.9     93.8        0
Baseline (C4+SP0)   Defense=Baseline           2026.02  94    -           93        91.9     92          -