| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Music Generation | Subjective Evaluation | Overall MOS: 76.08 | 5 |
| Text-to-Video Generation | Subjective Evaluation 10 text prompts (test) | User Preference Score: 3.69 | 4 |
| Talking Head Generation | Subjective Evaluation Talking Head Videos (test) | Metric: - | 0 |