SOTA Prompt Harmfulness Classification benchmarks and papers with code | Wizwand
Prompt Harmfulness Classification
Benchmarks
| Dataset Name | SOTA Method | Metric | Results | Last Updated |
| --- | --- | --- | --- | --- |
| Public Prompt Harmfulness Benchmarks (ToxicChat, OpenAI Moderation, AegisSafetyTest, SimpleSafetyTests, HarmBenchPrompt) | NemoGuard | OAI Score: 81 | 26 | 23d ago |
| WILDGUARD (test) | COLAGUARD | F1 Score: 89.44 | 18 | 5d ago |