
Generative Omnimatte: Learning to Decompose Video into Layers

About

Given a video and a set of input object masks, an omnimatte method aims to decompose the video into semantically meaningful layers, each containing an individual object along with its associated effects, such as shadows and reflections. Existing omnimatte methods assume a static background or accurate pose and depth estimation, and they produce poor decompositions when these assumptions are violated. Furthermore, because they lack a generative prior on natural videos, existing methods cannot complete occluded dynamic regions. We present a novel generative layered video decomposition framework to address the omnimatte problem. Our method does not assume a stationary scene or require camera pose or depth information, and it produces clean, complete layers, including convincing completions of occluded dynamic regions. Our core idea is to train a video diffusion model to identify and remove scene effects caused by a specific object. We show that this model can be finetuned from an existing video inpainting model with a small, carefully curated dataset, and we demonstrate high-quality decompositions and editing results for a wide range of casually captured videos containing soft shadows, glossy reflections, splashing water, and more.
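To make the idea of a layered decomposition concrete, the sketch below recomposes a video from a background layer and RGBA object layers using standard back-to-front "over" compositing. The abstract does not state the compositing model the authors use, so this is an illustrative assumption rather than the paper's implementation. The function name `composite_layers`, the array shapes, and the toy values are all hypothetical.

```python
import numpy as np

def composite_layers(background, layers):
    """Recompose a video with back-to-front alpha ("over") compositing.

    background: float array of shape (T, H, W, 3), values in [0, 1].
    layers: list of float arrays of shape (T, H, W, 4) holding RGB + alpha,
            ordered from back to front. Each layer is assumed to carry one
            object together with its effects (e.g. a shadow as a dark,
            semi-transparent region).
    """
    video = background.astype(np.float64)  # astype returns a copy
    for layer in layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        # Standard "over" operator: the layer covers what is behind it
        # in proportion to its alpha.
        video = alpha * rgb + (1.0 - alpha) * video
    return video

# Toy example (assumed values): a grey background, a white opaque object,
# and a soft shadow stored in the same layer as black at 40% opacity.
T, H, W = 2, 4, 4
background = np.full((T, H, W, 3), 0.5)
layer = np.zeros((T, H, W, 4))
layer[:, 1:3, 1:3, :3] = 1.0  # object colour
layer[:, 1:3, 1:3, 3] = 1.0   # object is fully opaque
layer[:, 3, 1:3, 3] = 0.4     # shadow: RGB stays 0, alpha 0.4
video = composite_layers(background, [layer])
```

Under this assumed model, removing an object and its effects amounts to dropping its layer from the list, which is why a clean decomposition is useful for editing.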

Yao-Chih Lee, Erika Lu, Sarah Rumbley, Michal Geyer, Jia-Bin Huang, Tali Dekel, Forrester Cole • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Object Removal | Real-World Videos | Internal Physics Score 2.3 | 21 |
| Background layer reconstruction | Synthetic Movie scenes OmnimatteRF benchmark (test) | PSNR 38.38 | 13 |
| Video Object Removal | ROSE Bench | LPIPS 0.1013 | 13 |
| Video Object Removal | DAVIS | mPSNR 27.56 | 9 |
| Video Object Removal | DAVIS 2016 | CLIP-T 0.2814 | 7 |
| Video Object Removal | BridgeRemoval-Bench | CLIP-T 0.2966 | 7 |
| Video Object Removal | BridgeRemoval-Bench 1.0 (test) | Motion Smoothness 99.25 | 7 |
| Video Object Removal | DAVIS (test) | Motion Smoothness 0.9757 | 7 |
| Video Object and Interaction Deletion | Real-world videos (75 scenarios) | Win % 11.2 | 7 |
| Video Object Removal | Synthetic (Kubric + HUMOTO) (test) | PSNR 29.44 | 7 |
Showing 10 of 22 rows
