Multi-label Toxic Content Classification on Jigsaw-ML (Adversarial Robustness)
[Chart: Attack Success Rate over time. Data point shown: AT1-unk, 71.7, Apr 9, 2024. Selectable metrics: Attack Success Rate, Avg Queries, Avg Perturbed Words.]
Evaluation Results
Method  | Method details            | Links   | Attack Success Rate | Avg Queries | Avg Perturbed Words
AT1-unk | Training=AT1-unk, Sear... | 2024.04 | 71.7                | 91.08       | 11.84
No AT   | Training=No AT, Search... | 2024.04 | 98.75               | 49.38       | 6.96