| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| AI-generated text detection | Writing Generated by Claude3 (test) | AUROC 99.5 | 15 |
| AI-generated text detection | Writing Generated by GPT-4 (test) | AUROC 0.9768 | 15 |
| AI-generated text detection | Writing Generated by ChatGPT (test) | AUROC 0.9916 | 15 |
| Idea Generation | Writing | Ideas Accepted 1,000 | 3 |
| Downstream classification | Writing Unconstrained | F1 Score 22.1 | 3 |
| Downstream classification | Writing Category-controlled top-K | F1 Score 14.2 | 3 |