| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Watermark Detection | longform_qa | Accuracy | 100 | 48 |
| Detection Accuracy | LongForm QA | Accuracy | 99.88 | 24 |
| Factuality Evaluation | LongForm | Precision | 33.4 | 6 |
| Detection | LongForm | Score (gpt-5.1) | 100 | 5 |
| Prevention | LongForm | Score (gpt-5.1) | 100 | 5 |