| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Online overhead computation | WAN3 | Time (s) | 0.049 | 32 |
| Online overhead computation | WAN 1 | Latency (s) | 0.049 | 32 |
| Text-to-Video generation | Wan 1.3B 2.1 | CLIPSIM | 0.193 | 27 |
| Video Generation Inference Speed | Wan 768P Video (81 frames, 1280x768, without padding) 2.1 (inference evaluation) | Inference Time (s) | 92 | 17 |
| Long Video Quality Evaluation | Wan 2.2 | Spearman Correlation | 0.835 | 12 |
| Text-to-Video generation | Wan2.1 14B CFG = 5.0, 720 × 1280p, frames = 80 (test) | CLIPSIM | 0.182 | 11 |
| Video Generation | Wan2.1 14B (test) | CLIPSIM | 0.183 | 11 |
| Video Generation | Wan 1.3B (81 frames, 832×480) 2.1 | VBench Score | 81.3 | 10 |
| Video Reconstruction | Wan2.1 Evaluation Set | Latency (s) | 131 | 10 |
| AI-generated video detection | Wan Frontier Commercial Generators | Accuracy | 85.55 | 7 |
| Video Aesthetic Evaluation | Wan 2.2 (test) | LAP | 5.623 | 7 |
| Identity Protection for Image-to-Video Generation | Wan TI2V 5B 2.2 | ISM | 0.672 | 7 |
| Text-to-Video Generation | Wan 14B 2.2 | ImgQual | 71.2 | 7 |
| Image-to-Video Generation | Wan 14B 2.2 | Image Quality Score | 0.704 | 7 |
| Video Generation | Wan-14B 720P 81 frames 2.1 (test) | VBench Score | 83.62 | 7 |
| Video Generation | Wan2.1-14B 69 frames (test) | Vision Reward | 0.136 | 7 |
| Text-to-Video | Wan 81 frames, 480p 2.2 | FLOPs (P) | 2.67 | 6 |
| Text-to-Video | Wan 81 frames 720p 2.2 | FLOPs (P) | 9.87 | 6 |
| Video Generation | Wan high-noise 2.2 | TA Mean | 4.5857 | 6 |
| Text-to-Video Generation | Wan2.1-1.3B (4x native length) | Subject Consistency | 0.9865 | 6 |
| Video Generation | Wan2.1 1.3B | PSNR | 26.6 | 6 |
| Video Generation | Wan2.1-1.3B 480p resolution (test) | IQ | 67.68 | 6 |
| Video Generation | Wan 14B 720p resolution 2.1 (test) | IQ | 69.08 | 6 |
| Image-to-Video Generation | Wan 2.2 | Q-Save | 10.26 | 6 |
| Video Generation | Wan2.1-1.3B 4-step distilled | VBench | 0.846 | 6 |