| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Music Generation | Subjective Evaluation | Overall MOS | 76.08 | 5 |
| Text-to-Video Generation | Subjective Evaluation 10 text prompts (test) | User Preference Score | 3.69 | 4 |
| Subjective Evaluation | Subjective Evaluation Mandarin Accent | Perceived Accuracy | 70.77 | 1 |
| Subjective Evaluation | Subjective Evaluation Hindi Accent | Accuracy | 78.46 | 1 |
| Subjective Evaluation | Subjective Evaluation Spanish Accent | Accuracy (%) | 53.85 | 1 |
| Subjective Evaluation | Subjective Evaluation French Accent | Accuracy | 66.15 | 1 |
| Subjective Evaluation | Subjective Evaluation German Accent | Accuracy | 53.85 | 1 |
| Subjective Evaluation | Subjective Evaluation British (England) Accent | Accuracy | 78.46 | 1 |
| Subjective Evaluation | Subjective Evaluation US Accent | Perceived Accuracy | 80 | 1 |
| Talking Head Generation | Subjective Evaluation Talking Head Videos (test) | Metric | - | 0 |