Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models

About

We present a neural network structure, FramePack, to train next-frame (or next-frame-section) prediction models for video generation. FramePack compresses input frame contexts with frame-wise importance so that more frames can be encoded within a fixed context length, with more important frames having longer contexts. The frame importance can be measured using time proximity, feature similarity, or hybrid metrics. The packing method allows for inference with thousands of frames and training with relatively large batch sizes. We also present drift prevention methods to address observation bias (error accumulation), including early-established endpoints, adjusted sampling orders, and discrete history representation. Ablation studies validate the effectiveness of the anti-drifting methods in both single-directional video streaming and bi-directional video generation. Finally, we show that existing video diffusion models can be finetuned with FramePack, and analyze the differences between different packing schedules.

Lvmin Zhang, Shengqu Cai, Muyang Li, Gordon Wetzstein, Maneesh Agrawala • 2025
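
The packing idea in the abstract (more important frames get longer contexts, while the total context stays bounded) can be illustrated with a small sketch. This is not the authors' implementation: it assumes importance is measured by time proximity only, and that each older frame gets a token budget that shrinks geometrically. The function name `pack_context_lengths`, the `base_tokens` default, and the decay factor of 2 are all hypothetical choices made for illustration.

```python
def pack_context_lengths(num_frames, base_tokens=1536, decay=2):
    """Return a per-frame token budget, most recent frame first.

    Hypothetical sketch: frame i (0 = most recent) gets
    base_tokens // decay**i tokens. Frames whose budget rounds
    down to 0 contribute nothing to the context.
    """
    if decay < 2:
        raise ValueError("decay must be an integer >= 2 for a bounded total")
    return [base_tokens // (decay ** i) for i in range(num_frames)]


# The four most recent frames under the assumed defaults.
print(pack_context_lengths(4))

# The total stays below 2 * base_tokens no matter how many frames are packed.
print(sum(pack_context_lengths(5000)))
```

Under these assumed defaults, the first call yields budgets of 1536, 768, 384, and 192 tokens, and packing 5000 frames uses 3070 tokens in total. Because the geometric series converges, the context length is bounded regardless of video length, which is the property that makes inference over thousands of frames feasible in a fixed context.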

Related benchmarks

Task | Dataset | Result | Rank
Video Generation | VBench | -- | 102
Video Generation | VBench 30-second generation | Imaging Quality 83.61 | 11
Video Generation | VBench-Long 30s videos | FPS 0.92 | 8
Cinematic Video Generation | Scene-Decoupled Video Dataset (test) | CLIP-T 32.69 | 6
Cinematic Video Generation | DiT360 OOD (test) | CLIP Score (T) 0.3223 | 6
Long Video Generation | VBench general long video generation | Img Quality 69.72 | 6
Long Video Generation | VBench-Long 100 customized narrative scripts 60s 1.0 (test) | Quality Score 84.4 | 5
Long Video Generation | Single-prompt 30-second video generation (test) | Total Score 81.95 | 5
Instructional Streaming Video Generation | Ego-Exo4D KeyStep (val) | Average Score 0.1609 | 5
Text-to-Video Generation | MovieGen-Bench 60s 100 text prompts long-form evaluation | Dino Score (10s) 81.86 | 5