
Safety Benchmarks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Jailbreak Attack Evaluation | Five Safety Benchmarks AdvBench, HarmBench, HarmfulQ, JBBench, StrongReject | ASR 7.69 | 6
Safety Evaluation | Safety Benchmarks Overall | Cost per Accuracy Point ($) 0.001 | 4
Safety Evaluation | Safety Benchmarks Aggregate (test) | Generation Quality (Std Prefix) 73.6 | 4
Safety Evaluation | Five Safety Benchmarks direct_q | ASR 0.02 | 3