
MAGI-1: Autoregressive Video Generation at Scale

About

We present MAGI-1, a world model that generates videos by autoregressively predicting a sequence of video chunks, defined as fixed-length segments of consecutive frames. Trained to denoise per-chunk noise that increases monotonically over time, MAGI-1 enables causal temporal modeling and naturally supports streaming generation. It achieves strong performance on image-to-video (I2V) tasks conditioned on text instructions, providing high temporal consistency and scalability, which are made possible by several algorithmic innovations and a dedicated infrastructure stack. MAGI-1 facilitates controllable generation via chunk-wise prompting and supports real-time, memory-efficient deployment by maintaining constant peak inference cost, regardless of video length. The largest variant of MAGI-1 comprises 24 billion parameters and supports context lengths of up to 4 million tokens, demonstrating the scalability and robustness of our approach. The code and models are available at https://github.com/SandAI-org/MAGI-1 and https://github.com/SandAI-org/MagiAttention. The product can be accessed at https://sand.ai.
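The chunk-wise scheme described above can be illustrated with a toy scheduler. This is a minimal sketch under stated assumptions, not the MAGI-1 sampler: the function name `stream_chunks`, the `window` and `stages` parameters, and the rule of one denoising stage per tick are all hypothetical. It only shows the two properties the abstract names: noise that is non-decreasing from older to newer chunks, and a bounded number of chunks in flight regardless of video length.

```python
from collections import deque


def stream_chunks(total_chunks, window=4, stages=4):
    """Toy pipeline schedule for chunk-wise autoregressive denoising.

    Hypothetical illustration only. Each active chunk tracks how many
    denoising stages it still needs; more remaining stages = more noise.
    """
    active = deque()   # entries are [chunk_index, remaining_stages], oldest first
    next_chunk = 0
    emitted = []       # indices of fully denoised chunks, in emission order
    peak_active = 0    # largest number of chunks held at once

    while len(emitted) < total_chunks:
        # Admit a new, fully noisy chunk when the window has room.
        if next_chunk < total_chunks and len(active) < window:
            active.append([next_chunk, stages])
            next_chunk += 1
        peak_active = max(peak_active, len(active))

        # Noise is non-decreasing from the oldest to the newest chunk.
        levels = [remaining for _, remaining in active]
        assert levels == sorted(levels)

        # One tick: every active chunk advances one stage toward clean.
        for entry in active:
            entry[1] -= 1

        # Stream out the oldest chunk as soon as it is clean.
        while active and active[0][1] == 0:
            emitted.append(active.popleft()[0])

    return emitted, peak_active
```

In this sketch, calling `stream_chunks(10)` and `stream_chunks(100)` both keep at most `window` chunks active at any tick, which is the sense in which peak cost stays constant as the video gets longer; chunks are emitted in order, so output can be streamed.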

Sand.ai, Hansi Teng, Hongyu Jia, Lei Sun, Lingzhi Li, Maolin Li, Mingqiu Tang, Shuai Han, Tianning Zhang, W.Q. Zhang, Weifeng Luo, Xiaoyang Kang, Yuchen Sun, Yue Cao, Yunpeng Huang, Yutong Lin, Yuxin Fang, Zewei Tao, Zheng Zhang, Zhongshu Wang, Zixun Liu, Dai Shi, Guoli Su, Hanwen Sun, Hong Pan, Jie Wang, Jiexin Sheng, Min Cui, Min Hu, Ming Yan, Shucheng Yin, Siran Zhang, Tingting Liu, Xianping Yin, Xiaoyu Yang, Xin Song, Xuan Hu, Yankai Zhang, Yuqiao Li • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Long Video Generation | VBench-Long 60 seconds | Subject Consistency 79.46 | 74 |
| Video Generation | VBench 5s | Quality Score 82.04 | 73 |
| Video Generation | VBench (test) | Semantic Score 72.02 | 66 |
| Video Generation | VBench Long | Motion Smoothness 99.1 | 49 |
| Video Generation | short videos 81-frames 240 prompts | Total Score 5.25 | 38 |
| Text-to-Video Generation | VBench (test) | Total Score 79.18 | 37 |
| Video Generation | VBench | Motion Smoothness 98.43 | 37 |
| Image-to-Video Generation | VBench I2V | -- | 24 |
| Text-to-Video Generation | StoryEval-Bench 1.0 (test) | Human Score 39.6 | 22 |
| Video Generation | VBench 1.0 (test) | Image Quality 0.6066 | 21 |
Showing 10 of 48 rows
