| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Online overhead computation | WAN3 | Time (s) | 0.049 | 32 |
| Online overhead computation | WAN 1 | Latency (s) | 0.049 | 32 |
| Text-to-Video generation | Wan 1.3B 2.1 | CLIPSIM | 0.193 | 27 |
| Text-to-Video generation | Wan2.1 14B CFG = 5.0, 720 × 1280p, frames = 80 (test) | CLIPSIM | 0.182 | 11 |
| Video Generation | Wan2.1 14B (test) | CLIPSIM | 0.183 | 11 |
| Identity Protection for Image-to-Video Generation | Wan TI2V 5B 2.2 | ISM | 0.672 | 7 |
| Text-to-Video Generation | Wan 14B 2.2 | ImgQual | 71.2 | 7 |
| Image-to-Video Generation | Wan 14B 2.2 | Image Quality Score | 0.704 | 7 |
| Video Generation | Wan-14B 720P 81 frames 2.1 (test) | VBench Score | 83.62 | 7 |
| Video Generation | Wan2.1-14B 69 frames (test) | Vision Reward | 0.136 | 7 |
| Video Generation | Wan2.1 1.3B | PSNR | 26.6 | 6 |
| Video Generation | Wan2.1-1.3B 480p resolution (test) | IQ | 67.68 | 6 |
| Video Generation | Wan 14B 720p resolution 2.1 (test) | IQ | 69.08 | 6 |
| Image-to-Video Generation | Wan 2.2 | Q-Save | 10.26 | 6 |
| Video Generation | Wan2.1-1.3B 4-step distilled | VBench | 0.846 | 6 |
| Text-to-Video Generation | Wan 1.3B-T2V (81 frames, 480P, 50steps) 2.1 | CLIP-SCORE | 30.3637 | 5 |
| Text-to-Video Generation | Wan2.1-14B-T2V 81 frames, 480P, 50steps | CLIP-SCORE | 31.4445 | 5 |
| Text-to-Video Generation | Wan2.1-14B-T2V-distilled 81 frames, 480P, 4steps, no cfg | CLIP-SCORE | 31.997 | 5 |
| Video Generation | Wan-1.3B | Speedup | 2.25 | 5 |
| Video Generation | Wan t2v-1.3B 2.1 | Vendi-v Score | 0.155 | 5 |
| Video Generation | Wan 1.3B 50-step base 2.1 | PSNR | 17.22 | 5 |
| Text-to-Video Generation | Wan Text-to-Video 720P 2.1-14B | T.F. | 98.48 | 4 |
| Text-to-Video Generation | Wan 1.3B Text-to-Video 480P 2.1 | Temporal Fidelity | 97.36 | 4 |
| Training | wan 1.3B 2.2 | Step Time | 1.22 | 4 |
| Generative Diversity Evaluation | Wan T2V-14B 2.1 | DINOv3 Score | 0.826 | 4 |