
PolyRefuse

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Safety Refusal Evaluation | PolyRefuse yo 1.0 | Harmful Refusal Rate: 17.7 | 21 |
| Safety Refusal Evaluation | PolyRefuse 1.0 (si) | Harmful Refusal Rate: 90.2 | 21 |
| Safety Refusal Evaluation | PolyRefuse 1.0 (km) | Harmful Refusal Rate: 90.9 | 21 |
| Safety Refusal Evaluation | PolyRefuse 1.0 (my) | Harmful Refusal Rate: 91.3 | 21 |
| Safety Refusal Evaluation | PolyRefuse am 1.0 | Harmful Refusal Rate: 93.2 | 21 |
| Safety Refusal Evaluation | PolyRefuse 1.0 (sw) | Harmful Refusal Rate: 96.7 | 21 |