Refusal Evaluation on Do-Not-Answer
Best Refusal Rate: 95.21 (Low-Rank Combination)
[Chart: Refusal Rate by date; Mar 9, 2026]
Updated 1mo ago
Evaluation Results
Method                      Configuration               Date     Refusal Rate
Low-Rank Combination        Backbone=REFUSE-LLAMA,...   2026.03  95.21
Categorical Steering        Backbone=REFUSE-LLAMA,...   2026.03  92.55
REFUSE-LLAMA                Backbone=REFUSE-LLAMA       2026.03  87.01
Low-Rank Combination        Backbone=DEEPSEEK R1 D...   2026.03  71.67
DEEPSEEK R1 DISTILL LLAMA   Backbone=DEEPSEEK R1 D...   2026.03  69.86
LLAMA 3 8B INSTRUCT         Backbone=LLAMA 3 8B IN...   2026.03  54.85
Low-Rank Combination        Backbone=LLAMA 3 8B IN...   2026.03  50.48