| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Summarization | WaterBench (test) | GM 22.03 | 11 |
| Reasoning & Coding | WaterBench (test) | GM 59.82 | 11 |
| Long-form QA | WaterBench (test) | GM Score 24.06 | 11 |
| Diffusion Language Model Watermarking | WaterBench 600 prompts 2024 | PPL 2.8 | 9 |
| Text Generation Quality Evaluation | WaterBench 1000 prompts | PPL 9.878 | 6 |
| Watermarking Detection | WaterBench 1000 prompts | Completeness 98.3 | 5 |