Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

SORRY-Bench

Benchmarks

Task NameDataset NameSOTA ResultTrend
Safety evaluationSorry-Bench
Safety Score99.09
90
Safety AlignmentSORRY-Bench
ASR10.22
40
Safety EvaluationSorry-Bench base
Safety Score92.73
27
Harmful Request DefenseSORRY-Bench
ASR13
24
Refusal ControlSORRY-Bench
Refusal Rate70.45
7
Showing 5 of 5 rows