| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Watermark Robustness Analysis | Gemma-2 2B | Post-attack TPR: 100 | 49 | |
| Semantic similarity analysis | Gemma-2 9B within-prompt completions | Cosine Distance: 0.312 | 8 | |
| Semantic similarity analysis | Gemma-2 2B within-prompt completions | Cosine Distance: 0.316 | 8 | |
| Input-based feature description evaluation | Gemma-2 MLP SAE features | Score: 56.6 | 8 | |
| Input-based feature description evaluation | Gemma-2 Residual SAE features | Feature Description Score: 67 | 8 | |
| Watermark Detection Robustness | Gemma-2 9B Pre-trained (PT) (test) | TPR (Baseline): 100 | 7 | |
| Watermarked text generation and detection | Gemma-2 2B-IT | TPR: 99 | 1 | |