Jailbreak Defense on HarmBench (test)
[Chart: ASR by date. Y-axis: ASR; x-axis label: May 27, 2026; legend entry: BCT]
Updated 6d ago
Evaluation Results
Method    Base Model       Date     ASR
BCT       GPT-OSS-20B      2026.05  0
ACT       Qwen3-8B         2026.05  0
BCT       Phi-4-reaso...   2026.05  0
ACT       Phi-4-reaso...   2026.05  1
ACT       Qwen3-1.7B       2026.05  2.4
ACT       GPT-OSS-20B      2026.05  4.5
ACT       Gemma-4-E4B-it   2026.05  11.2
Baseline  GPT-OSS-20B      2026.05  12.2
BCT       Gemma-4-E4B-it   2026.05  14.3
BCT       Qwen3-8B         2026.05  18.4
Baseline  Phi-4-reaso...   2026.05  30
BCT       Qwen3-1.7B       2026.05  51
Baseline  Qwen3-8B         2026.05  62.6
Baseline  Qwen3-1.7B       2026.05  63.2
Baseline  Gemma-4-E4B-it   2026.05  73.5