
SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model

About

We propose SceneMaker, a decoupled 3D scene generation framework. Because they lack sufficient open-set de-occlusion and pose estimation priors, existing methods struggle to simultaneously produce high-quality geometry and accurate poses under severe occlusion and in open-set settings. To address these issues, we first decouple the de-occlusion model from 3D object generation and enhance it by leveraging image datasets and collected de-occlusion datasets to cover much more diverse open-set occlusion patterns. Then, we propose a unified pose estimation model that integrates global and local mechanisms for both self-attention and cross-attention to improve accuracy. In addition, we construct an open-set 3D scene dataset to further extend the generalization of the pose estimation model. Comprehensive experiments demonstrate the superiority of our decoupled framework on both indoor and open-set scenes. Our code and datasets are released at https://idea-research.github.io/SceneMaker/.

Yukai Shi, Weiyu Li, Zihao Wang, Hongyang Li, Xingyu Chen, Ping Tan, Lei Zhang • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| 3D Scene Generation | 3D-Front (test) | CD (Surface) | 0.0381 | 12 |
| Scene Generation | MIDI (test) | CD-S | 5.1 | 9 |
| Scene Generation | Open-set (test) | CD-S | 15.38 | 4 |
| De-occlusion | Collected 1K images, 500 classes (val) | PSNR | 15.03 | 3 |
| Object Generation | 3D-Front rendered by InstPifu (test) | Chamfer Distance | 0.0409 | 3 |
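
Several of the benchmarks above report a Chamfer Distance (CD). As a rough illustration of what that kind of metric measures, here is a minimal sketch of one common symmetric variant: the mean nearest-neighbor Euclidean distance from each point set to the other, summed over both directions. This page does not say which variant, point sampling, or scaling the listed results use, so the function name `chamfer_distance` and every detail below are assumptions for illustration, not the paper's evaluation code.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed variant (not taken from the paper): for each direction, average
    the Euclidean distance from every point to its nearest neighbor in the
    other set, then add the two directional averages.
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def one_way(src, dst):
        # Brute-force nearest neighbor; fine for small illustrative inputs.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


# Hypothetical toy point sets, not data from the benchmarks.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd_value = chamfer_distance(predicted, reference)
```

Lower values mean the two shapes are closer. Other variants use squared distances or different normalization, which is one reason CD numbers from different benchmarks are not directly comparable.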
