Refusal Evaluation on WildGuard Harmful
[Chart: Refusal Rate over time. Highlighted point: Low-Rank Combination, Refusal Rate 84.35, Mar 9, 2026.]
Evaluation Results
Method                     Configuration              Date     Refusal Rate
Low-Rank Combination       Backbone=DEEPSEEK R1 D...  2026.03  84.35
DEEPSEEK R1 DISTILL LLAMA  Backbone=DEEPSEEK R1 D...  2026.03  77.72
Categorical Steering       Backbone=REFUSE-LLAMA,...  2026.03  77.19
Low-Rank Combination       Backbone=LLAMA 3 8B IN...  2026.03  74.27
Low-Rank Combination       Backbone=REFUSE-LLAMA,...  2026.03  73.87
LLAMA 3 8B INSTRUCT        Backbone=LLAMA 3 8B IN...  2026.03  73.74
REFUSE-LLAMA               Backbone=REFUSE-LLAMA      2026.03  59.02