| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text-to-Video Generation | VBench | Quality Score: 86.73 | 168 |
| Video Generation | VBench | Quality Score: 86.67 | 126 |
| Long Video Generation | VBench-Long 60 seconds | Subject Consistency: 98.6 | 74 |
| Video Generation | VBench 5s | Quality Score: 86.97 | 73 |
| Video Generation | VBench (test) | Semantic Score: 83.4 | 66 |
| Video Generation | VBench 2.0 (test) | Total Score: 83.79 | 49 |
| Video Generation | VBench Long | Motion Smoothness: 99.62 | 49 |
| Video Generation | VBench | Total Score: 85.11 | 42 |
| Video Generation | VBench | Motion Smoothness: 99.4 | 37 |
| Text-to-Video Generation | VBench (test) | Total Score: 84.26 | 37 |
| Long Video Generation | VBench | Overall Score: 98.5 | 35 |
| Image-to-Video Generation | VBench | Motion Smoothness: 0.994 | 28 |
| Text-to-Video Generation | VBench | Latency (s): 82.9 | 26 |
| Video Generation | VBench 2.0 | Human Fidelity: 0.951 | 26 |
| Text-to-Video Generation | VBench T2V | Overall Score: 84.12 | 25 |
| Image-to-Video Generation | VBench I2V | Background Consistency: 99.08 | 24 |
| Video Generation | VBench | Motion Smoothness: 99.28 | 23 |
| Text-to-Video Generation | VBench augmented prompts | Quality Score: 86.12 | 21 |
| Text-to-Video Generation | VBench | Imaging Quality (IQ): 62.3 | 21 |
| Text-to-Video Generation | VBench | Aesthetic Quality: 58.56 | 21 |
| Text-to-Video Generation | VBench HunyuanVideo (test) | VBench Score (%): 81.4 | 21 |
| Video Generation | VBench 1.0 (test) | Image Quality: 84.71 | 21 |
| Video Generation | VBench extended prompts | Subject Consistency: 97.32 | 19 |
| Video Generation | VBench | Subject Consistency: 97.39 | 19 |
| Long Video Generation | VBench-Long 30 seconds | Subject Consistency: 98.07 | 18 |