
Adversarial Attack Defense on Held-out attacks (test)

7.8 ASR (Multi-turn Manip.)

SecureCAI

Jan 12, 2026
Updated 4d ago

Evaluation Results

Method  Links
2026.01
7.8  6.2  8.4  9.1  7.9
2026.01
52.3  44.7  48.1  56.8  50.5