Sub-task Completion on AI-Pentest-Benchmark Single Experiment

AC Score: 46

Qwen3-32B-finetune (Ours)

Sep 16, 2025
Updated 1mo ago

Evaluation Results

Method    Links
2025.09    46    38    15    38    114    251
2025.09    31    30    11    18     55    145
2025.09    26    28    11    22     79    166
2025.09    25    24    12    15     49    125
2025.09    21    26     9    18     29    103
2025.09    20    18     6    12     28     84
2025.09    16    22    10    17     29     94