
DANGEROUSQA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Red-Teaming (Attack Success Rate) | DANGEROUSQA | ASR: 0 | 30
Safety Evaluation | DANGEROUSQA Llama-2 base | Chinese Safety Score: 15.3 | 8
Jailbreak Attack | DangerousQA | Harmful Rate: 1.01 | 6
Safety Evaluation | DangerousQA (test) | Harmful Rate: 0.0122 | 3