Jailbreak Robustness on WildTeaming WJ
[Chart: Evaluation Score (avg@4) by date. Top entry: Self-ReSET, 95.1 (May 9, 2026).]
Updated 22d ago
Evaluation Results
| Method | Base Model | Date | Evaluation Score (avg@4) |
| --- | --- | --- | --- |
| Self-ReSET | Qwen3-8B | 2026.05 | 95.1 |
| Self-ReSET | DeepSeek-R1... | 2026.05 | 94.6 |
| RECAP | DeepSeek-R1... | 2026.05 | 93 |
| Self-ReSET | DeepSeek-R1... | 2026.05 | 91.3 |
| DAPO | DeepSeek-R1... | 2026.05 | 87.8 |
| STAR-1 | DeepSeek-R1... | 2026.05 | 84 |
| DAPO | Qwen3-8B | 2026.05 | 82.8 |
| RECAP | Qwen3-8B | 2026.05 | 80.2 |
| STAR-1 | DeepSeek-R1... | 2026.05 | 75 |
| STAR-1 | Qwen3-8B | 2026.05 | 75 |
| RECAP | DeepSeek-R1... | 2026.05 | 74.9 |
| DAPO | DeepSeek-R1... | 2026.05 | 70.5 |
| Safechain | Qwen3-8B | 2026.05 | 66.8 |
| Safechain | DeepSeek-R1... | 2026.05 | 64 |
| Safechain | DeepSeek-R1... | 2026.05 | 59.6 |
| Base | Qwen3-8B | 2026.05 | 56 |
| Base | DeepSeek-R1... | 2026.05 | 48.8 |
| Base | DeepSeek-R1... | 2026.05 | 48.1 |