
Safety Classification on Safety Evaluation Scenarios Hate Speech

Safe Classification Rate: 99.9

Llama-2-7B-Chat

Mar 10, 2026
Updated 1mo ago

Evaluation Results

Method   Links
2026.03   99.9   0.1
2026.03   43.3   56.7