SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

About

We present SceneFactor, a diffusion-based approach for large-scale 3D scene generation that enables controllable generation and effortless editing. SceneFactor enables text-guided 3D scene synthesis through our factored diffusion formulation, leveraging latent semantic and geometric manifolds to generate 3D scenes of arbitrary size. While text input enables easy, controllable generation, text guidance remains too imprecise for intuitive, localized editing and manipulation of the generated 3D scenes. Our factored semantic diffusion therefore generates a proxy semantic space composed of semantic 3D boxes. Generated scenes can be edited in a controllable way by adding, removing, or resizing these semantic 3D proxy boxes, which in turn guide high-fidelity, consistent 3D geometric editing. Extensive experiments demonstrate that our factored diffusion approach enables high-fidelity 3D scene synthesis with effective controllable editing.
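The box-level editing described above can be illustrated with a small sketch. This is not the authors' implementation: the `ProxyBox` class, the integer label encoding (0 for empty space), and the dense voxel grid are all hypothetical choices made for illustration, and the geometric diffusion stage that would consume the edited semantic map is not shown.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

import numpy as np


@dataclass(frozen=True)
class ProxyBox:
    """Hypothetical axis-aligned semantic box in voxel coordinates."""
    label: int                   # semantic class id; 0 is reserved for empty space
    lo: Tuple[int, int, int]     # inclusive minimum corner
    hi: Tuple[int, int, int]     # exclusive maximum corner


def rasterize(boxes: List[ProxyBox], shape: Tuple[int, int, int]) -> np.ndarray:
    """Paint boxes into a dense label grid; later boxes overwrite earlier ones."""
    grid = np.zeros(shape, dtype=np.int32)
    for b in boxes:
        (x0, y0, z0), (x1, y1, z1) = b.lo, b.hi
        grid[x0:x1, y0:y1, z0:z1] = b.label
    return grid


def resize(box: ProxyBox, new_hi: Tuple[int, int, int]) -> ProxyBox:
    """Return a copy of the box with a new maximum corner."""
    return replace(box, hi=new_hi)


# A toy scene with two boxes on an 8x8x8 grid.
boxes = [
    ProxyBox(label=1, lo=(0, 0, 0), hi=(4, 2, 4)),
    ProxyBox(label=2, lo=(5, 0, 5), hi=(7, 3, 7)),
]
before = rasterize(boxes, (8, 8, 8))

# Edit: enlarge box 1, remove box 2, add a new box with label 3.
edited = [resize(boxes[0], (6, 2, 4)), ProxyBox(label=3, lo=(0, 0, 6), hi=(2, 2, 8))]
after = rasterize(edited, (8, 8, 8))

# Voxels whose semantic label changed; in a factored pipeline only this
# region would need its geometry regenerated (an assumption of this sketch).
changed = before != after
```

The point of the sketch is that all three edit operations reduce to changing a short list of boxes and re-rasterizing, which gives a precise, local edit region that a text prompt alone cannot specify.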

Alexey Bokhovkin, Quan Meng, Shubham Tulsiani, Angela Dai • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| 3D Scene Generation | 3D-FRONT | P(Tr) 65 | 5 |
| 3D Scene Geometry Synthesis | 3D-FRONT Independent chunks | MMD (CD) 0.019 | 5 |
| 3D Scene Geometry Synthesis | 3D-FRONT Independent chunks 1.0 (test) | MMD (CD) 0.021 | 5 |
| Text-guided 3D scene generation | 3D Scenes with Qwen1.5 captions (Independent chunks) | CLIP-Score 23.96 | 4 |
| Text-guided 3D scene generation | 3D Scenes with Qwen1.5 captions (Scene chunks) | CLIP-Score 23.79 | 4 |
| Text-to-3D Scene Generation | 3D-FRONT Independent chunks | CLIP Score 29.81 | 4 |
| Text-to-3D Scene Generation | 3D-FRONT (Scene chunks) | CLIP Score 29.4 | 4 |
| 3D Scene Geometry Synthesis | 3D-FRONT Scene chunks Outpainted | MMD (CD) 0.021 | 3 |
| 3D Scene Geometry Synthesis | 3D-FRONT Scene chunks 1.0 (test) | MMD (CD) 0.026 | 3 |
| 3D Scene Synthesis | 3D-FRONT Independent chunks | MMD (CD) 0.019 | 3 |
Showing 10 of 13 rows
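
For readers unfamiliar with the MMD (CD) metric listed above, the following is a minimal sketch of the commonly used definition: minimum matching distance under Chamfer distance, where lower is better. The exact variant behind the listed numbers is not stated on this page, so the squared-distance form, the mean reduction, and the function names `chamfer_distance` and `mmd_cd` are assumptions of this sketch.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Uses squared Euclidean distances and averages over points; other
    conventions (unsquared distances, sums) also appear in practice.
    """
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # pairwise (N, M)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def mmd_cd(generated: list, reference: list) -> float:
    """For each reference shape, take the distance to its closest generated
    shape, then average over the reference set."""
    return float(np.mean([
        min(chamfer_distance(g, r) for g in generated)
        for r in reference
    ]))


# Tiny example: one reference point cloud and two generated candidates.
ref = [np.array([[0.0, 0.0, 0.0]])]
gen = [np.array([[1.0, 0.0, 0.0]]), np.array([[3.0, 0.0, 0.0]])]
score = mmd_cd(gen, ref)  # the closest candidate is the point at x = 1
```

Because the minimum is taken per reference shape, the metric measures how well the generated set covers the reference set with close matches; it does not penalize generated samples that match nothing.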
