Adversarial Attack Detection on Gandalf

Metric: Recall

[Chart: Recall by date. Labeled entry: Llama Prompt Guard 2 (Dec 23, 2025).]
Updated 4d ago

Evaluation Results

Date     Recall
2025.12  1
2025.12  0.91
2025.12  0.91
2025.12  0.7
2025.12  0.69
2025.12  0.63
2025.12  0.52
2025.12  0.47
2025.12  0.44
2025.12  0.41
2025.12  0.27
2025.12  0.26
2025.12  0.23
2025.12  0.02
2025.12  0