
SafetyBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Safety Evaluation | SafetyBench en | Avg Score: 81.2 | 25 |
| Safety Evaluation | SafetyBench zh | Avg Score: 83.2 | 21 |
| Safety Evaluation | SafetyBench (test) | Accuracy: 81.321 | 9 |
| Jailbreak Attack | SafetyBench LLaVA-2 Integrated from AdvBench (test) | Illegal Activity Success Rate: 83.73 | 4 |
| Jailbreak Attack | SafetyBench MiniGPT-4 Integrated from AdvBench (test) | IA (Illegal Activity): 0.7024 | 4 |