| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio-driven half-body human video generation | EMTD 1.0 (evaluation set) | FID 49.33 | 14 |
| Talking avatar video generation | EMTD (test) | FID 59.87 | 10 |
| Head-Oriented Image-to-Video Generation | EMTD | IQA 2.31 | 6 |
| Audio-driven video generation | EMTD (test) | FID 15.66 | 6 |
| Talking Head Generation | EMTD | Sync-C 8.61 | 4 |
| Audio-to-video synthesis | EMTD | LSE-C 7.05 | 3 |