SOTA Safety Refusal benchmarks and papers with code | Wizwand
Safety Refusal
Benchmarks
Dataset Name               SOTA Method           Metric               Results  Last Updated
AdvBench                   Low-Rank Combination  Refusal Rate: 99.42  46       1mo ago
Safety Evaluation Prompts  DirAbl                Refusal: 62.5        40       1mo ago
Jailbreak Prompts          ITI                   Refusal Rate: 87.2   15       1mo ago
ToxicChat                  WAS                   Refusal Rate: 95     15       1mo ago