Adversarial Detection on GCG
[Chart: DSR. Highlighted result: PPL Filter, DSR 100, May 2, 2026. Updated 28d ago.]
Evaluation Results
| Method | Target Model | Date | DSR |
|---|---|---|---|
| PPL Filter | Qwen2.5-7... | 2026.05 | 100 |
| PPL Filter | Mistral-7... | 2026.05 | 100 |
| PPL Filter | Llama-3.1... | 2026.05 | 100 |
| SALO | Mistral-7... | 2026.05 | 96.5 |
| Smooth LLM | Qwen2.5-7... | 2026.05 | 94.6 |
| No Defense (1-ASR) | Llama-3.1... | 2026.05 | 88.3 |
| SALO | Qwen2.5-7... | 2026.05 | 88.3 |
| SALO | Llama-3.1... | 2026.05 | 85.4 |
| Linear Probe | Qwen2.5-7... | 2026.05 | 82.9 |
| Smooth LLM | Llama-3.1... | 2026.05 | 80.6 |
| GradSafe | Llama-3.1... | 2026.05 | 76.4 |
| No Defense (1-ASR) | Qwen2.5-7... | 2026.05 | 71.2 |
| Linear Probe | Mistral-7... | 2026.05 | 71 |
| Linear Probe | Llama-3.1... | 2026.05 | 64.2 |
| Smooth LLM | Mistral-7... | 2026.05 | 55.8 |
| No Defense (1-ASR) | Mistral-7... | 2026.05 | 44.6 |
| GradSafe | Mistral-7... | 2026.05 | 36.7 |
| GradSafe | Qwen2.5-7... | 2026.05 | 17.7 |