Multi-label Toxic Content Classification on Jigsaw-ML (Adversarial Robustness)
[Chart: Attack Success Rate by date. Highlighted entry: AT1-unk, 71.7 (Apr 9, 2024). Selectable metrics: Attack Success Rate, Avg Queries, Avg Perturbed Words.]
Evaluation Results

Method   Details                    Date     Attack Success Rate  Avg Queries  Avg Perturbed Words
AT1-unk  Training=AT1-unk, Sear...  2024.04  71.7                 91.08        11.84
No AT    Training=No AT, Search...  2024.04  98.75                49.38        6.96