| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| AgentHazard full | AgentHazard (full) | Goal-text Entropy | 0.93 | 1 | 6d ago |
| AgentHazard 3-app | AgentHazard (Liu et al., 2025) (3-app) | Goal-text Entropy | 0.976 | 1 | 6d ago |
| MIRAGE 3-app overlap | MIRAGE (3-app overlap; 30 unique base) | Goal-text Entropy | 0.918 | 1 | 6d ago |
| GhostEI-Bench | GhostEI-Bench (Chen et al., 2025) | Goal-text Entropy | 0.979 | 1 | 6d ago |
| MIRAGE matched-n | MIRAGE matched-n (vs. GhostEI) | Goal-text Entropy | 0.927 | 1 | 6d ago |
| MIRAGE full | MIRAGE (full; 96 unique base) | Goal-text Entropy | 0.933 | 1 | 6d ago |