
Learning Hierarchical Semantic Image Manipulation through Structured Representations

About

Understanding, reasoning about, and manipulating the semantic concepts of images has been a fundamental research problem for decades. Previous work mainly focused on direct manipulation on the natural image manifold through color strokes, key-points, textures, and holes-to-fill. In this work, we present a novel hierarchical framework for semantic image manipulation. Key to our hierarchical framework is that we employ a structured semantic layout as our intermediate representation for manipulation. Initialized with coarse-level bounding boxes, our structure generator first creates a pixel-wise semantic layout capturing the object shape, object-object interactions, and object-scene relations. Then our image generator fills in the pixel-level textures guided by the semantic layout. Such a framework allows a user to manipulate images at the object level by adding, removing, and moving one bounding box at a time. Experimental evaluations demonstrate the advantages of the hierarchical manipulation framework over existing image generation and context hole-filling models, both qualitatively and quantitatively. Benefits of the hierarchical framework are further demonstrated in applications such as semantic object manipulation, interactive image editing, and data-driven image manipulation.
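The two-stage flow described above (bounding box, then semantic layout, then pixels) can be illustrated with a toy sketch. This is not the authors' model: `Box`, `toy_structure_generator`, `toy_image_generator`, and `add_object` are hypothetical names, and both learned generators are replaced by trivial hand-written stand-ins (an elliptical mask and a flat color per class) purely to show how the layout acts as the intermediate representation.

```python
from dataclasses import dataclass


@dataclass
class Box:
    top: int
    left: int
    height: int
    width: int
    label: int  # semantic class id of the object to add (hypothetical encoding)


def _box_ranges(box, n_rows, n_cols):
    """Row and column ranges of the box, clipped to the image bounds."""
    rows = range(max(0, box.top), min(n_rows, box.top + box.height))
    cols = range(max(0, box.left), min(n_cols, box.left + box.width))
    return rows, cols


def toy_structure_generator(layout, box):
    """Stand-in for a learned structure generator (assumption, not the paper's
    network): draws an elliptical mask of class `box.label` inside the box and
    returns a new label map."""
    new_layout = [row[:] for row in layout]
    cy = box.top + (box.height - 1) / 2
    cx = box.left + (box.width - 1) / 2
    ry, rx = box.height / 2, box.width / 2
    rows, cols = _box_ranges(box, len(layout), len(layout[0]))
    for y in rows:
        for x in cols:
            if ((y - cy) / ry) ** 2 + ((x - cx) / rx) ** 2 <= 1.0:
                new_layout[y][x] = box.label
    return new_layout


def toy_image_generator(image, layout, box, palette):
    """Stand-in for a learned image generator (assumption): repaints only the
    pixels inside the box, using one flat color per semantic class."""
    new_image = [row[:] for row in image]
    rows, cols = _box_ranges(box, len(image), len(image[0]))
    for y in rows:
        for x in cols:
            new_image[y][x] = palette[layout[y][x]]
    return new_image


def add_object(image, layout, box, palette):
    """Object-level edit: box -> semantic layout -> pixels."""
    new_layout = toy_structure_generator(layout, box)
    new_image = toy_image_generator(image, new_layout, box, palette)
    return new_image, new_layout


# Usage: add one object of class 1 to an 8x8 scene of background class 0.
palette = {0: (128, 128, 128), 1: (0, 0, 255)}
layout = [[0] * 8 for _ in range(8)]
image = [[palette[0]] * 8 for _ in range(8)]
new_image, new_layout = add_object(image, layout, Box(2, 2, 4, 4, label=1), palette)
```

In this sketch, everything outside the box is left untouched and the inputs are not modified, which mirrors the one-box-at-a-time editing the abstract describes. Removing or moving an object would follow the same pattern with a different box and label.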

Seunghoon Hong, Xinchen Yan, Thomas Huang, Honglak Lee • 2018

Related benchmarks

Task | Dataset | Result | Rank
Semantic Image Editing | Cityscapes | FID 15.58 | 21
Semantic Image Editing | ADE20K-Room | FID 28.64 | 21
Object Addition | Cityscapes (test) | FID 6.92 | 7
Human preference study | Flickr Landscapes (test) | Preference Score 22.5 | 7
Object Addition | ADE20K Room (test) | FID 29.1 | 7
Object Removal | Cityscapes | SSIM 0.584 | 2
Object Removal | ADE20K | SSIM 0.456 | 2
