# Coherent Online Video Style Transfer

## About
Training a feed-forward network for fast neural style transfer of images has proven successful. However, the naive extension that processes video frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near real time. Two key ideas are an efficient network that incorporates short-term coherence, and the propagation of short-term coherence to long-term coherence, which ensures consistency over longer periods of time. Our network can incorporate different image stylization networks. We show that the proposed method clearly outperforms the per-frame baseline both qualitatively and quantitatively. Moreover, it achieves coherence visually comparable to optimization-based video style transfer while running three orders of magnitude faster.
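The short-term coherence idea can be sketched as follows: warp the previous stylized frame toward the current one using optical flow, keep the warped result where the flow is traceable, and fall back to the per-frame stylization in occluded or untraceable regions. This is a minimal NumPy illustration of that compositing step, not the paper's learned network; the function names, the nearest-neighbour warp, and the binary mask are simplifying assumptions.

```python
import numpy as np

def warp(frame, flow):
    """Warp a frame using per-pixel backward optical flow.

    frame: (H, W, C) float array; flow: (H, W, 2) array of (dx, dy)
    displacements. Nearest-neighbour sampling keeps the sketch short;
    a real implementation would use bilinear sampling.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def coherent_frame(stylized_cur, stylized_prev, flow, mask):
    """Blend the warped previous stylized frame into the current one.

    mask is 1 where the flow is reliable (traceable pixels) and 0 in
    occluded regions, which fall back to the per-frame stylization.
    """
    warped = warp(stylized_prev, flow)
    m = mask[..., None]  # broadcast the mask over color channels
    return m * warped + (1.0 - m) * stylized_cur
```

With zero flow and a fully reliable mask, the output simply copies the previous stylized frame; with a zero mask, it reduces to the independent per-frame result. Propagating coherence long-term then amounts to applying this step recursively, so each output carries forward information from all earlier frames.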
## Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Video Style Transfer | Videvo.net | Q1 Score | 64 | 6 |
| Video-to-video synthesis | Apolloscape | Human Preference Score | 0.41 | 4 |
| Video-to-video synthesis | Cityscapes short sequences (val) | FID (I3D) | 5.55 | 4 |
| Video-to-video synthesis | Cityscapes long sequences (demo videos) | Human Preference Score | 20 | 4 |
| Video Style Transfer | MPI Sintel (val) | Alley-2 Score | 0.0934 | 4 |