SOTA Red Teaming benchmarks and papers with code | Wizwand
Red Teaming
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| HarmBench | AutoDAN-Turbo | ASR | 96.3 | 244 | 1mo ago |
| Violence prompts | UnlearnDiffAtk | Failure Rate (FR) | 0 | 48 | 21d ago |
| I2P Nudity prompts | Ring-A-Bell | Failure Rate (FR) | 0 | 48 | 21d ago |
| 50 harmful goals (Manual evaluation) | PAIR | Hard ASR | 100 | 30 | 3mo ago |
| CatQA | SafeTransformer | ASR | 0 | 20 | 2mo ago |
| AdversarialQA | SafeTransformer | ASR | 0 | 20 | 2mo ago |
| Religious Discrimination principle v1 (test) | QCI | Mean Best Category Score | 5.32 | 12 | 3mo ago |
| Illegal Activity principle v1 (test) | RS | Mean Score (Best Category) | -2.73 | 12 | 3mo ago |
| AI Supremacy principle v1 (test) | CRL | Mean Best Category Score | 11.7 | 12 | 3mo ago |
| AdvBench (test) | AMIS | ASR | 88 | 8 | 2mo ago |
| DailyDialog against DialoGPT-large | BRT (e+r) | RSR | 40 | 8 | 3mo ago |
| DailyDialog against BB-3B | BRT (e+r) | RSR | 40.2 | 8 | 3mo ago |
| ConvAI2 (filtered hard positive) | BRT (e+r) | RSR | 2,120 | 7 | 3mo ago |
| Bloom ZS (filtered hard positive) | BRT (e+r) | RSR | 15.6 | 7 | 3mo ago |
| BAD Against Friend Chat (test) | BRT (e) | RSR | 64.2 | 7 | 3mo ago |
| BAD Against Marv (test) | BRT (s+r) | RSR | 88.1 | 7 | 3mo ago |
| GPT-OSS 20B | PAIR | Coverage | 63.2 | 5 | 26d ago |
| Llama-3-8B | Ours (ME) | Coverage | 63.04 | 5 | 26d ago |
| Web-Augmented LLM Red-Teaming Evaluation Set | CREST-Search | Detection Rate | 80.5 | 5 | 1mo ago |
| Korean red teaming dataset (test) | Exaone-3.5-2.4B-inst | Attack Success Rate | 0.5797 | 5 | 3mo ago |
| HarmBench Claude-Sonnet-3.5 (held-out test) | AGENTICRED | ASR | 60 | 5 | 3mo ago |
| HarmBench Llama-3-8B (test) | AGENTICRED | ASR | 0.98 | 5 | 3mo ago |
| HarmBench Llama-2-7B (test) | AutoDAN-Turbo | ASR | 36 | 5 | 3mo ago |
| GPT-5 Mini | Ours (ME) | Coverage | 72.32 | 4 | 26d ago |
| KT RAIC proprietary Korean red-teaming dataset | EXAONE-4.0-32B | Attack Success Rate | 54 | 4 | 2mo ago |
Showing 25 of 35 rows