Harmlessness

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| LLM Alignment | Harmlessness | WR: 87.85 | 7 |
| Harmlessness Evaluation | Harmlessness (evaluation set) | Win Rate: 48.76 | 5 |
| Harmlessness evaluation | Harmlessness | Disc. Score: 0.5409 | 5 |