Jailbreak Robustness on AutoRAN
[Chart: Harmfulness Rate; labeled point: RealSafe-R1, Aug 6, 2025]
Updated 27d ago
Evaluation Results
Method          Model        Date     Harmfulness Rate
RealSafe-R1     R1-Qwen-14B  2025.08  0
SafeKey         R1-Qwen-14B  2025.08  0
ThinkingI       R1-Qwen-14B  2025.08  0
SAFEPATH-ZS     R1-Qwen-14B  2025.08  0
ReasoningGuard  R1-Qwen-14B  2025.08  0
No Defense      R1-Qwen-32B  2025.08  0
RealSafe-R1     R1-Qwen-32B  2025.08  0
SAFEPATH-ZS     R1-Qwen-32B  2025.08  0
ReasoningGuard  R1-Qwen-32B  2025.08  0
No Defense      R1-Qwen-14B  2025.08  2
Self-Reminder   R1-Qwen-14B  2025.08  2
SmoothLLM       R1-Qwen-14B  2025.08  2
Paraphrase      R1-Qwen-32B  2025.08  2
Self-Reminder   R1-Qwen-32B  2025.08  4
SmoothLLM       R1-Qwen-32B  2025.08  4
Paraphrase      R1-Qwen-14B  2025.08  6
ThinkingI       R1-Qwen-32B  2025.08  6