| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Spoken Dialogue | MultiDialog 1.0 (test) | PPL 930.401 | 8 |
| Audio-visual generation | MultiDialog (test) | SIM 0.624 | 4 |
| Fine-grained Score Accuracy | MultiDialog | Exact Accuracy 65.62 | 1 |
| Binary classification (Human vs Machine speech) | MultiDialog (Human-Human) OOD (test) | Accuracy 95.31 | 1 |