Safety Classifier Evasion on SND (Secret-Note Diffusion) filter-evasion set
[Chart: ∂ins Success Rate for LlamaGuard3 (F1); data point dated May 12, 2026. Available metrics: ∂ins Success Rate, ∂ret Q1 Success Rate, ∂ret Q2 Success Rate, ∂ret Q3 Success Rate, ∂comp Success Rate, Overall Success Rate.]
Updated 9d ago
Evaluation Results
| Method | Details | Date | ∂ins Success Rate | ∂ret Q1 Success Rate | ∂ret Q2 Success Rate | ∂ret Q3 Success Rate | ∂comp Success Rate | Overall Success Rate |
|---|---|---|---|---|---|---|---|---|
| LlamaGuard3 (F1) | Developer=Meta, Traini... | 2026.05 | 100 | 100 | 100 | 100 | 80 | 95.9 |
| AprielGuard (F3) | Developer=ServiceNow A... | 2026.05 | 100 | 100 | 100 | 100 | 100 | 100 |
| Granite Guardian (F4) | Developer=IBM, Trainin... | 2026.05 | 100 | 100 | 100 | 100 | 100 | 100 |
| WildGuard (F2) | Developer=AllenAI, Tra... | 2026.05 | 85.7 | 100 | 100 | 95.5 | 70 | 90 |