Allegro: Open the Black Box of Commercial-Level Video Generation Model

About

Significant advancements have been made in the field of video generation, with the open-source community contributing a wealth of research papers and tools for training high-quality models. However, despite these efforts, the available information and resources remain insufficient for achieving commercial-level performance. In this report, we open the black box and introduce Allegro, an advanced video generation model that excels in both quality and temporal consistency. We also highlight the current limitations in the field and present a comprehensive methodology for training high-performance, commercial-level video generation models, addressing key aspects such as data, model architecture, training pipeline, and evaluation. Our user study shows that Allegro surpasses existing open-source models and most commercial models, ranking just behind Hailuo and Kling. Code: https://github.com/rhymes-ai/Allegro , Model: https://huggingface.co/rhymes-ai/Allegro , Gallery: https://rhymes.ai/allegro_gallery .

Yuan Zhou, Qiuyue Wang, Yuxuan Cai, Huan Yang • 2024

Related benchmarks

Task	Dataset	Result
Text-to-Video Generation	VBench	Quality Score 83.1	209
Video Generation	UCF-101 (test)	Inception Score 67.16	105
Video Generation	VideoPhy	SA (%) 51.27	50
Video Reconstruction	WebVid 10M	PSNR 32.18	45
3D Scene Generation	WorldScore	Camera Control 24.84	33
Video Generation	WorldScore (test)	Average Score 53.64	27
Video Reconstruction	Panda-70M	PSNR 31.7	21
Video Generation	SkyTimelapse (test)	FVD16 117.3	16
Video Generation	WorldModelBench	Instruction Score 1.91	11
