
ROSE: Remove Objects with Side Effects in Videos

About

Video object removal has achieved advanced performance due to the recent success of video generative models. However, when addressing the side effects of objects, e.g., their shadows and reflections, existing methods struggle to eliminate these effects because paired video data for supervision is scarce. This paper presents ROSE (Remove Objects with Side Effects), a framework that systematically studies an object's effects on its environment, which can be categorized into five common cases: shadows, reflections, light, translucency, and mirror. Given the challenges of curating paired videos exhibiting these effects, we leverage a 3D rendering engine for synthetic data generation. We carefully construct a fully automatic pipeline for data preparation, which simulates a large-scale paired dataset with diverse scenes, objects, shooting angles, and camera trajectories. ROSE is implemented as a video inpainting model built on a diffusion transformer. To localize all object-correlated areas, the entire video is fed into the model for reference-based erasing. Moreover, additional supervision is introduced to explicitly predict the areas affected by side effects, which can be revealed through the differential mask between the paired videos. To fully investigate model performance across the different kinds of side-effect removal, we present a new benchmark, dubbed ROSE-Bench, incorporating both common scenarios and the five special side effects for comprehensive evaluation. Experimental results demonstrate that ROSE achieves superior performance compared to existing video object erasing models and generalizes well to real-world video scenarios. The project page is https://rose2025-inpaint.github.io/.

Chenxuan Miao, Yutong Feng, Jianshu Zeng, Zixiang Gao, Hantang Liu, Yunfeng Yan, Donglian Qi, Xi Chen, Bin Wang, Hengshuang Zhao • 2025
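
The abstract mentions a "differential mask between the paired videos" as the source of extra supervision. The sketch below illustrates one plausible way such a mask could be derived from a synthetic pair (the same clip rendered with and without the object). It is not the authors' implementation: the function name `difference_mask`, the `(T, H, W, C)` float layout, and the threshold value are all assumptions made for illustration.

```python
import numpy as np

def difference_mask(with_object, without_object, threshold=0.05):
    """Binary mask of pixels that differ between two paired videos.

    Both inputs are assumed to be float arrays in [0, 1] with shape
    (T, H, W, C). The threshold is an illustrative choice, not a value
    taken from the paper.
    """
    if with_object.shape != without_object.shape:
        raise ValueError("paired videos must have identical shapes")
    # Largest absolute per-channel difference at each pixel of each frame.
    diff = np.abs(with_object - without_object).max(axis=-1)
    return diff > threshold

# Toy pair: a flat gray background, plus an "object" pixel and a darker
# two-pixel "shadow" that only exist in the with-object render.
T, H, W = 2, 4, 4
background = np.full((T, H, W, 3), 0.5)
video = background.copy()
video[:, 1, 1] = 1.0       # object
video[:, 2, 1:3] = 0.3     # shadow cast by the object

mask = difference_mask(video, background)
print(mask.sum(axis=(1, 2)))  # changed pixels per frame: [3 3]
```

Because the mask is computed from the rendered pair rather than from the object's silhouette, it covers the shadow pixels as well as the object itself, which is the property the abstract relies on.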

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Object Removal | Real-World Videos | Internal Physics Score | 2.25 | 21 |
| Video Object Removal | Scene-Bench | Removal Completeness | 6.349 | 16 |
| Video Object Removal | ROSE Bench | LPIPS | 0.077 | 13 |
| Video Object Removal | DAVIS | TokSim | 29.36 | 10 |
| Video Object Removal | WIPER-Bench | TokSim | 30.02 | 9 |
| Video Object Removal | DAVIS | mPSNR | 28.11 | 9 |
| Video Object Removal | ROSE-Benchmark with GT | PSNR | 31.122 | 8 |
| Video Object Removal | VOR-Eval with GT | PSNR | 22.966 | 8 |
| Video Object Removal | VOR-Wild without GT | QScore | 9.24 | 8 |
| Video Object Removal | CAMERA-Bench | PSNR | 26.4577 | 8 |
Showing 10 of 20 rows
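
Several of the rows above report PSNR. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, 10 · log10(MAX² / MSE). The listing does not state how each benchmark computes it (value range, whether only the removed region is scored, how frames are averaged), so the `max_value=1.0` default and the whole-clip averaging below are assumptions.

```python
import numpy as np

def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB over all pixels of a clip.

    Assumes both arrays share a shape and a value range of
    [0, max_value]; this is the textbook definition, not a specific
    benchmark's evaluation protocol.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(max_value ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] scale gives MSE = 0.01.
target = np.zeros((2, 4, 4, 3))
pred = target + 0.1
print(round(psnr(pred, target), 2))  # 20.0
```

Higher is better for PSNR, whereas LPIPS in the same table is a perceptual distance where lower is better.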
