Adaptive Jailbreak Attack Success Rate on PAIR behaviors held-out (test)
[Chart: ASR by date. Highlighted point: Qwen3-8B, ASR 100, May 20, 2026.]
Updated 12d ago
Evaluation Results
| Method | Configuration | Date | ASR |
| --- | --- | --- | --- |
| Qwen3-8B | Defense=Baseline, Samp... | 2026.05 | 100 |
| Qwen3-4B-Instruct-2507 | Defense=Baseline, Samp... | 2026.05 | 95 |
| gpt-oss-20b | Defense=Baseline, Samp... | 2026.05 | 94 |
| Qwen3-4B-Instruct-2507 | Defense=BCT, Sampling... | 2026.05 | 16.9 |
| Qwen3-8B | Defense=BCT, Sampling... | 2026.05 | 12 |
| Qwen3-8B | Defense=OPCT, Sampling... | 2026.05 | 1.2 |
| Qwen3-4B-Instruct-2507 | Defense=OPCT, Sampling... | 2026.05 | 0 |
| gpt-oss-20b | Defense=BCT, Sampling... | 2026.05 | 0 |
| gpt-oss-20b | Defense=OPCT, Sampling... | 2026.05 | 0 |