
SceneLinker: Compositional 3D Scene Generation via Semantic Scene Graph from RGB Sequences

About

We introduce SceneLinker, a novel framework that generates compositional 3D scenes via semantic scene graphs from RGB sequences. To adaptively experience Mixed Reality (MR) content in each user's space, it is essential to generate a 3D scene that reflects the real-world layout by compactly capturing the semantic cues of the surroundings. Prior works either struggled to fully capture the contextual relationships between objects or focused mainly on synthesizing diverse shapes, making it challenging to generate 3D scenes aligned with real-world object arrangements. We address these challenges by designing a graph network with cross-check feature attention for scene graph prediction, and by constructing a graph variational autoencoder (graph-VAE) with a joint shape-and-layout block for 3D scene generation. Experiments on the 3RScan/3DSSG and SG-FRONT datasets demonstrate that our approach outperforms state-of-the-art methods in both quantitative and qualitative evaluations, even in complex indoor environments and under challenging scene graph constraints. Our work enables users to generate consistent 3D spaces from their physical environments via scene graphs, allowing them to create spatial MR content. The project page is available at https://scenelinker2026.github.io.
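To make the core data structure concrete: a semantic scene graph represents a scene as object nodes connected by (subject, predicate, object) relationship edges, as in the 3DSSG and SG-FRONT annotations. The sketch below is a minimal, hypothetical illustration of that structure; the class and method names are our own and do not reflect the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    # Object node labels, e.g. "chair", "table".
    objects: list[str] = field(default_factory=list)
    # Relationship edges as (subject_index, predicate, object_index).
    relations: list[tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, label: str) -> int:
        """Add an object node and return its index."""
        self.objects.append(label)
        return len(self.objects) - 1

    def add_relation(self, subj: int, predicate: str, obj: int) -> None:
        """Add a directed predicate edge between two object nodes."""
        self.relations.append((subj, predicate, obj))

    def triples(self) -> list[tuple[str, str, str]]:
        """Resolve edges into human-readable (subject, predicate, object) triples."""
        return [(self.objects[s], p, self.objects[o]) for s, p, o in self.relations]

# Example: a scene where a chair stands left of a table.
g = SceneGraph()
chair = g.add_object("chair")
table = g.add_object("table")
g.add_relation(chair, "left of", table)
print(g.triples())  # [('chair', 'left of', 'table')]
```

Spatial predicates such as "left of" are exactly what the Left/Right Accuracy metric on SG-FRONT evaluates.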

Seok-Young Kim, Dooyoung Kim, Woojin Cho, Hail Song, Suji Kang, Woontack Woo · 2026

Related benchmarks

Task | Dataset | Result | Rank
3D Scene Generation | SG-FRONT (test) | Left/Right Accuracy: 98 | 11
3D Scene Graph Prediction | 3RScan, 160 object and 26 predicate classes (test) | Recall (Relationship): 68.7 | 6
Scene Graph Prediction | 3RScan, 20 object and 8 predicate classes (test) | Recall (Relationship): 68.3 | 6
