| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Machine-Generated Text Detection | MAGE | TP @ 20% | 85.12 | 18 |
| Detection of Machine-Generated Text | MAGE Main Experimental Supplement (test) | TP@20% | 85.12 | 14 |
| Detection of Machine-Generated Text | MAGE (test) | TP @ 20% Threshold | 85.12 | 14 |
| Machine-Generated Text Detection | MAGE COLING2025 (val) | AUC | 79.55 | 13 |
| Machine-generated text detection | MAGE Unseen Domains & Unseen Model (test) | Human Recall | 95.65 | 9 |
| Detection of LLM-generated text | MAGE Topic-based 3.5-turbo | Detection Accuracy | 100 | 8 |
| Detection of LLM-generated text | MAGE News Topic-based 3.5-turbo | Detection Performance | 99.95 | 8 |
| LLM-generated text detection | MAGE QA short text (<= 30 words) | AUROC | 0.9747 | 8 |
| LLM-generated text detection | MAGE News short text (<= 30 words) | AUROC | 93.48 | 8 |
| Detection of LLM generated text | MAGE QA | ROC AUC (FPR=1%) | 65.33 | 8 |
| Detection of LLM generated text | MAGE News | ROC AUC @ FPR=1% | 0.6577 | 8 |
| LLM-generated text detection | MAGE DIPPER attack | Human Score | 77.44 | 8 |
| Detection Evasion | MAGE | TPR@1% (R) | 22.5 | 6 |
| Machine-generated text detection | MAGE Arbitrary-domains & Arbitrary-models (test) | Human Recall | 0.9572 | 5 |
| Machine-generated text detection | MAGE Paraphrasing Attack (test) | Human Recall | 79.66 | 4 |