Jailbreak Detection on AutoDAN (ASR/PGR)
[Chart: Attack Success Rate (ASR) and PGR by method; data point shown: GradientCuff, Apr 1, 2026]
Updated 16d ago
Evaluation Results
| Method | Details | Date | Attack Success Rate (ASR) | PGR |
|---|---|---|---|---|
| GradientCuff | Latency (Sec.)=19.26,... | 2026.04 | 0 | 1 |
| Token Highlighter | Latency (Sec.)=15.20,... | 2026.04 | 0 | 6 |
| SelfDefend (Direct) | Latency (Sec.)=0.80, M... | 2026.04 | 0 | 11 |
| WildGuard (Pre) | Latency (Sec.)=2.74, M... | 2026.04 | 0 | 2 |
| GuardReasoner (Pre) | Latency (Sec.)=11.90,... | 2026.04 | 0 | 1 |
| GuardReasoner (Post) | Latency (Sec.)=13.14,... | 2026.04 | 0 | 98 |
| SelfGrader | Latency (Sec.)=0.78, M... | 2026.04 | 0 | 1 |
| GradSafe | | 2026.04 | 1 | 4 |
| SelfDefend (Intent) | Latency (Sec.)=2.38, M... | 2026.04 | 1 | 11 |
| WildGuard (Post) | Latency (Sec.)=2.78, M... | 2026.04 | 1 | 99 |
| Llama-3-8B-Instruct (No Defense) | Latency (Sec.)=0.86 | 2026.04 | 2 | - |
| Perplexity Filter | Latency (Sec.)=0.27, M... | 2026.04 | 2 | 100 |
| Prompt Guard | Latency (Sec.)=16.87,... | 2026.04 | 2 | 42 |
| Llama Guard (Pre) | Latency (Sec.)=0.62, M... | 2026.04 | 2 | 50 |
| Llama Guard (Post) | Latency (Sec.)=0.70, M... | 2026.04 | 2 | 99 |