SkillVetBench

Benchmarks

Task Name                Dataset Name                     SOTA Result                 Trend
Vulnerability Detection  SkillVetBench                    Malicious Verdict Count: 0  9
Vulnerability Detection  SkillVetBench Privilege Abuse    Malicious Verdict Count: 0  9
Vulnerability Detection  SkillVetBench Supply Chain       Malicious Verdict Count: 0  9
Vulnerability Detection  SkillVetBench Data Exposure      Malicious Verdict Count: 0  9
Vulnerability Detection  SkillVetBench Unsafe File Ops    Malicious Verdict Count: 0  9
Vulnerability Detection  SkillVetBench Prompt Injection   Malicious Verdict Count: 0  9
Vulnerability Detection  SkillVetBench Command Injection  Malicious Verdict Count: 5  9
Vulnerability Detection  SkillVetBench Memory Poisoning   Malicious Verdict Count: 1  3
Showing 8 of 8 rows