
Harmful Benchmarks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Safety Evaluation | Harmful Benchmarks (CATQA, HEX-PHI, Salad-Base) | CATQA Score: 99.94 | 24