DoNotAnswer

Benchmarks

Task Name                               Dataset Name            SOTA Result   Trend
Safety Evaluation                       DoNotAnswer Framed      HRR: 0        96
Toxicity and Harmful Content Detection  DoNotAnswer             Score: 99.98  5
Safety Detection                        DoNotAnswer (held-out)  AUROC: 97.4   5
Showing 3 of 3 rows