| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Machine-Generated Text Detection | MAGE | TP @ 20%: 85.12 | 18 |
| Detection of Machine-Generated Text | MAGE Main Experimental Supplement (test) | TP@20%: 85.12 | 14 |
| Detection of Machine-Generated Text | MAGE (test) | TP @ 20% Threshold: 85.12 | 14 |
| Machine-Generated Text Detection | MAGE COLING2025 (val) | AUC: 79.55 | 13 |
| Machine-generated text detection | MAGE Unseen Domains & Unseen Model (test) | Human Recall: 95.65 | 9 |
| AI-generated text detection | MAGE BigScience 1.0 (test) | Accuracy: 96.7 | 8 |
| AI-generated text detection | MAGE GLM 1.0 (test) | Accuracy: 94.1 | 8 |
| AI-generated text detection | MAGE OPT 1.0 (test) | Accuracy: 89.1 | 8 |
| AI-generated text detection | MAGE (LLaMA) 1.0 (test) | Accuracy: 88 | 8 |
| AI-generated text detection | MAGE GPT 1.0 (test) | Accuracy: 82.7 | 8 |
| AI-generated text detection | MAGE FLAN-T5 1.0 (test) | Accuracy: 68.9 | 8 |
| Detection of LLM-generated text | MAGE Topic-based 3.5-turbo | Detection Accuracy: 100 | 8 |
| Detection of LLM-generated text | MAGE News Topic-based 3.5-turbo | Detection Performance: 99.95 | 8 |
| LLM-generated text detection | MAGE QA short text (<= 30 words) | AUROC: 0.9747 | 8 |
| LLM-generated text detection | MAGE News short text (<= 30 words) | AUROC: 93.48 | 8 |
| Detection of LLM generated text | MAGE QA | ROC AUC (FPR=1%): 65.33 | 8 |
| Detection of LLM generated text | MAGE News | ROC AUC @ FPR=1%: 0.6577 | 8 |
| LLM-generated text detection | MAGE DIPPER attack | Human Score: 77.44 | 8 |
| Detection Evasion | MAGE | TPR@1% (R): 22.5 | 6 |
| Machine-generated text detection | MAGE Arbitrary-domains & Arbitrary-models (test) | Human Recall: 0.9572 | 5 |
| Machine-generated text detection | MAGE Paraphrasing Attack (test) | Human Recall: 79.66 | 4 |
| AI-Generated Text Detection | MAGE DeepSeek-R1 OOD | Accuracy: 71 | 3 |
| AI-Generated Text Detection | MAGE Claude-sonnet-4-5 OOD | Accuracy: 57.1 | 3 |
| AI-Generated Text Detection | MAGE GPT-5 OOD | Accuracy: 68.7 | 3 |
| AI-Generated Text Detection | MAGE GPT-4 OOD | Accuracy: 61.8 | 3 |