Safety Evaluation on Jailbroken (test)
[Chart: ASR by date; plotted entry: SafeThinker, Jan 23, 2026. Updated 1mo ago.]
Evaluation Results
Method            Configuration              Date     ASR
SafeThinker       Backbone=Qwen2.5-14B-I...  2026.01  1
SafeDecoding      Backbone=Qwen2.5-14B-I...  2026.01  4
Self-Examination  Backbone=Qwen2.5-14B-I...  2026.01  7.6
Self-Reminder     Backbone=Qwen2.5-14B-I...  2026.01  12.8
ICD               Backbone=Qwen2.5-14B-I...  2026.01  14.2
PPL               Backbone=Qwen2.5-14B-I...  2026.01  19.8
No Defense        Backbone=Qwen2.5-14B-I...  2026.01  20.8