| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| AI-generated text detection | Writing Generated by Claude3 (test) | AUROC 99.5 | 15 |
| AI-generated text detection | Writing Generated by GPT-4 (test) | AUROC 0.9768 | 15 |
| AI-generated text detection | Writing Generated by ChatGPT (test) | AUROC 0.9916 | 15 |
| Classification | Writing 10-shot | Accuracy 91.3 | 10 |
| Classification | Writing 5-shot | Accuracy 87 | 10 |
| Classification | Writing 3-shot | Accuracy 75.1 | 10 |
| Human Sensing | Writing 5-shot | Training Time (mins) 7.71 | 5 |
| Human Sensing | Writing 10-shot | GPU Utilization (%) 92.22 | 5 |
| Human Sensing | Writing 5-shot | GPU Utilization (%) 85.47 | 5 |
| Human Sensing | Writing 3-shot | GPU Utilization (%) 75.14 | 5 |
| Human Sensing | Writing | Watch Latency (ms) 467.5 | 4 |
| Idea Generation | Writing | Ideas Accepted 1,000 | 3 |
| Downstream classification | Writing Unconstrained | F1 Score 22.1 | 3 |
| Downstream classification | Writing Category-controlled top-K | F1 Score 14.2 | 3 |