| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Hallucination Detection | Pool (test) | AuROC 0.8598 | 12 |
| Model Routing | Small Pool | Oracle Accuracy 92 | 6 |
| Model Routing | Small pool | Mean per-model AUC 82.6 | 6 |
| Content Localization | Pool HumanEdit and AiEdit average | Accuracy 98.46 | 5 |
| Speech Editing Detection | Pool HumanEdit and AiEdit average | Acc 98.46 | 5 |
| Trajectory-controlled video generation | Pool | Interaction Realism 4.4 | 2 |