GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations

About

Generative latent-variable models are emerging as promising tools in robotics and reinforcement learning. Yet, even though tasks in these domains typically involve distinct objects, most state-of-the-art generative models do not explicitly capture the compositional nature of visual scenes. Two recent exceptions, MONet and IODINE, decompose scenes into objects in an unsupervised fashion. Their underlying generative processes, however, do not account for component interactions. Hence, neither of them allows for principled sampling of novel scenes. Here we present GENESIS, the first object-centric generative model of 3D visual scenes capable of both decomposing and generating scenes by capturing relationships between scene components. GENESIS parameterises a spatial GMM over images which is decoded from a set of object-centric latent variables that are either inferred sequentially in an amortised fashion or sampled from an autoregressive prior. We train GENESIS on several publicly available datasets and evaluate its performance on scene generation, decomposition, and semi-supervised learning.

Martin Engelcke, Adam R. Kosiorek, Oiwi Parker Jones, Ingmar Posner • 2019
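
To make the "spatial GMM over images" in the abstract concrete, here is a minimal sketch of a per-pixel Gaussian mixture log-likelihood: each pixel is modelled as a mixture over K scene components, with mixing probabilities given by component masks and means given by decoded component appearances. The function name, array shapes, and the shared `sigma` value are assumptions made for illustration and are not taken from the GENESIS code; the sequential inference network and the autoregressive prior are not shown.

```python
import numpy as np

def spatial_gmm_log_likelihood(image, masks, means, sigma=0.7):
    """Log-likelihood of an image under a per-pixel Gaussian mixture.

    image: (H, W, C) array of pixel values
    masks: (K, H, W) mixing probabilities, summing to 1 over K at each pixel
    means: (K, H, W, C) per-component pixel means
    sigma: shared standard deviation (hypothetical value, an assumption)
    """
    # Gaussian log-density of each component, summed over channels: (K, H, W)
    sq_err = (image[None] - means) ** 2
    log_norm = -0.5 * np.log(2.0 * np.pi * sigma ** 2)
    log_comp = (log_norm - 0.5 * sq_err / sigma ** 2).sum(axis=-1)

    # log sum_k pi_k * N(x | mu_k, sigma^2), computed with a stable log-sum-exp
    log_weighted = np.log(masks + 1e-12) + log_comp
    peak = log_weighted.max(axis=0)
    log_mix = peak + np.log(np.exp(log_weighted - peak).sum(axis=0))

    # Pixels are treated as conditionally independent given the components
    return log_mix.sum()
```

Working in log space avoids underflow when a mask assigns a pixel almost no weight for some component, which is the common case once a scene has been cleanly decomposed.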

Related benchmarks

Task                              Dataset                Metric    Result    Rank
Stability Classification          ShapeStacks (test)     Accuracy  64        6
View Classification               ShapeStacks (test)     Accuracy  99.7      6
Height Classification             ShapeStacks (test)     Accuracy  80.8      6
Scene Generation                  Multi-dSprites (test)  FID       24.9      5
Scene Generation                  GQN (test)             FID       70.2      5
Object Segmentation               ObjectsRoom (test)     ARI (FG)  63        4
Unsupervised Object Segmentation  APC (test)             ARI       0.04      4
Object Segmentation               ShapeStacks (test)     ARI-FG    0.7       4
Image Generation                  APC                    FID       183.2     3
Scene Generation                  ObjectsRoom            ELBO      -7.02e+3  3
Showing 10 of 12 rows
