Attack Success Rate on 20,000 harmful requests and 20,000 jailbreak prompts (test)
[Chart: Attack Success Rate (ASR). Best result: 80.2, GateBreaker, Dec 24, 2025]
Evaluation Results
Method       Target Model   Date     Attack Success Rate (ASR)
GateBreaker  GPT-OSS-20B    2025.12  80.2
GateBreaker  Hunyuan-A...   2025.12  76.9
GateBreaker  OpenPangu...   2025.12  73.1
GateBreaker  Average        2025.12  64.9
GateBreaker  Mixtral-8...   2025.12  64.2
GateBreaker  Qwen1.5-M...   2025.12  57.8
GateBreaker  Qwen3-30B...   2025.12  56.9
GateBreaker  Phi-3.5-M...   2025.12  56.5
GateBreaker  Deepseek-...   2025.12  53.6
SAFEx        Mixtral-8...   2025.12  48.8
SAFEx        Deepseek-...   2025.12  35.6
SAFEx        Qwen1.5-M...   2025.12  35
SAFEx        OpenPangu...   2025.12  30
SAFEx        Average        2025.12  29.9
SAFEx        Qwen3-30B...   2025.12  28.4
SAFEx        Hunyuan-A...   2025.12  28.4
SAFEx        Phi-3.5-M...   2025.12  26.8
SAFEx        GPT-OSS-20B    2025.12  6.4
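
The "Average" rows can be cross-checked against the per-model scores. The sketch below assumes, since the page does not say so, that "Average" is the unweighted mean over the eight target models listed for each method. The dictionary layout and the `mean_asr` helper are illustrative names, not part of the benchmark. Model names are kept truncated exactly as they appear in the listing.

```python
# ASR values copied from the results above; truncated model names kept as-is.
scores = {
    "GateBreaker": {
        "GPT-OSS-20B": 80.2,
        "Hunyuan-A...": 76.9,
        "OpenPangu...": 73.1,
        "Mixtral-8...": 64.2,
        "Qwen1.5-M...": 57.8,
        "Qwen3-30B...": 56.9,
        "Phi-3.5-M...": 56.5,
        "Deepseek-...": 53.6,
    },
    "SAFEx": {
        "Mixtral-8...": 48.8,
        "Deepseek-...": 35.6,
        "Qwen1.5-M...": 35.0,
        "OpenPangu...": 30.0,
        "Qwen3-30B...": 28.4,
        "Hunyuan-A...": 28.4,
        "Phi-3.5-M...": 26.8,
        "GPT-OSS-20B": 6.4,
    },
}

# "Average" rows as reported in the listing.
reported_average = {"GateBreaker": 64.9, "SAFEx": 29.9}


def mean_asr(per_model):
    """Unweighted mean of the per-model ASR values (assumed definition)."""
    return sum(per_model.values()) / len(per_model)


for method, per_model in scores.items():
    recomputed = mean_asr(per_model)
    print(f"{method}: recomputed {recomputed:.1f}, reported {reported_average[method]}")
```

Under that assumption, the recomputed means agree with both reported averages to one decimal place.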