Pentesting Explanation Generation on Pentesting Scenarios (test)
[Chart: Explanation Score by date. Labeled point: Claude 4.5 Sonnet, 72, May 6, 2026. Updated 27d ago.]
Evaluation Results
Method              Details                        Date      Explanation Score
Claude 4.5 Sonnet   -                              2026.05   72
Qwen-3-14B-GRPO     Parameters=14B, Optimi...      2026.05   71
GPT 4.1             -                              2026.05   66
GPT-4o-mini         -                              2026.05   64
GPT-5               -                              2026.05   63
Claude 3 Haiku      -                              2026.05   58
Gemini 2.5 Flash    -                              2026.05   56
Gemini 2.0 Flash    -                              2026.05   53
LLaMA-3.1-8B        Parameters=8B                  2026.05   51
GPT-3.5-turbo       -                              2026.05   47
Qwen-3-14B          Parameters=14B                 2026.05   45