Automated Penetration Testing on Vulhub
Chart: Success Rate over time (Dec 11, 2025 to Mar 28, 2026); highlighted method: CHECKMATE (100). Available metrics: Success Rate, CV - Cost, CV - Time, Solved Count.
Updated 2mo ago
Evaluation Results
| Method | Configuration | Date | Success Rate | CV - Cost | CV - Time | Solved Count |
| --- | --- | --- | --- | --- | --- | --- |
| CHECKMATE | - | 2025.12 | 100 | 0.129 | 0.093 | - |
| Claude Code | - | 2025.12 | 75 | 0.451 | 0.325 | - |
| Red-MIRROR | LLM=DeepSeek-V3.2, Tim... | 2026.03 | 50 | - | - | 4 |
| PentestAgent | LLM=DeepSeek-V3.2, Tim... | 2026.03 | 50 | - | - | 4 |
| AutoPT | LLM=DeepSeek-V3.2, Tim... | 2026.03 | 37.5 | - | - | 3 |
| VulnBot | LLM=DeepSeek-V3.2, Tim... | 2026.03 | 0 | - | - | 0 |