| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Jailbreak Attack Transferability | Llama-3-8b-Instruct finetuned variants v1 (test) | TSR 51.2 | 16 |
| Matrix Multiplication Latency | Llama-3 70B | Kernel Latency (µs) 293.82 | 8 |
| Matrix Multiplication Latency | Llama-3 8B | Kernel-level latency (µs) 152.69 | 8 |
| Watermark Detection | Llama-3-8B Translate perturbation, 30 tokens 1.0 (test) | Mean P 0.13 | 6 |
| Watermark Detection Robustness | Llama-3-8B GPT-4o Paraphrase, 150 Tokens | Mean P 0.26 | 6 |
| Watermark Detection Robustness | Llama-3-8B GPT-4o Paraphrase, 30 Tokens | Mean P 29 | 6 |
| Watermark Detection Robustness | Llama-3-8B Swap 50%, 30 Tokens | Mean P 25 | 6 |