| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image2World generation | PAI-Bench | Domain Score Average: 0.86 | 17 |
| Text2World (T2W) Generation | PAI-Bench | Latency (s): 26.28 | 14 |
| Embodied Video Generation | PAI-Bench robot domain | AQ: 54.77 | 10 |
| Physical Perception | PAI-Bench | PAI-Bench Score: 68.5 | 9 |
| Image-to-Video | PAI-Bench VBench (test) | Delta Vote (%): 0 | 8 |
| Text-to-Video | PAI-Bench VBench (test) | Delta Vote (%): 0 | 8 |
| Image-to-Video | PAI-Bench VideoAlign (test) | ∆-Vote (%): 0 | 8 |
| Text-to-Video | PAI-Bench VideoAlign (test) | Delta Vote (%): 0 | 8 |
| Robotics Image-to-Video Generation | PAI-Bench-G | Grasp Success: 89.6 | 8 |
| Text-to-World | PAI-Bench | Domain Average: 78.62 | 3 |