| Dataset Name | SOTA Method | Metric | Trend | Updated |
|---|---|---|---|---|
| PHEME New Attacks: ExplainDrive (test) | LLM-SGA/ARHOCD | Accuracy 82.91 | 15 | 1mo ago |
| PHEME Known Attacks: DeepWordBug, TFAdjusted, TREPAT (test) | LLM-SGA/ARHOCD | Accuracy 85.59 | 10 | 1mo ago |
| FoodGuardBench (test) | FoodGuard-4B | FNR 2.75 | 7 | 16d ago |
| Standard Harmful Content Datasets Evasion Attack | GAVEL | Phishing 96 | 3 | 1mo ago |
| Standard Harmful Content Datasets (Goal Hijacking Attack) | GAVEL | Phishing 96 | 2 | 1mo ago |
| Standard Harmful Content Datasets Misdirection Attack | GAVEL | Phishing 97 | 2 | 1mo ago |