Controllable Person Image Synthesis with Attribute-Decomposed GAN
About
This paper introduces the Attribute-Decomposed GAN, a novel generative model for controllable person image synthesis, which can produce realistic person images with desired human attributes (e.g., pose, head, upper clothes and pants) provided in various source inputs. The core idea of the proposed model is to embed human attributes into the latent space as independent codes and thus achieve flexible and continuous control of attributes via mixing and interpolation operations in explicit style representations. Specifically, a new architecture consisting of two encoding pathways with style block connections is proposed to decompose the original hard mapping into multiple more accessible subtasks. In the source pathway, component layouts are further extracted with an off-the-shelf human parser and fed into a shared global texture encoder to obtain decomposed latent codes. This strategy allows for the synthesis of more realistic output images and automatic separation of unannotated attributes. Experimental results demonstrate the proposed method's superiority over the state of the art in pose transfer and its effectiveness in the brand-new task of component attribute transfer.
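The mixing and interpolation operations on independent attribute codes can be illustrated with a small NumPy sketch. It is not the authors' implementation: the random vectors stand in for encoder outputs, the code dimension is arbitrary, and the helper names (`random_codes`, `mix`, `interpolate`, `to_style_vector`) are hypothetical. Joining the component codes by concatenation is likewise an assumption made for illustration, and pose is left out because it would be handled by a separate pathway.

```python
import numpy as np

# Component names taken from the abstract's examples; pose is omitted here.
COMPONENTS = ("head", "upper_clothes", "pants")


def random_codes(rng, dim=4):
    """Stand-in for an encoder: one independent code vector per component."""
    return {name: rng.standard_normal(dim) for name in COMPONENTS}


def mix(base, donor, swap):
    """Take the components listed in `swap` from `donor`, the rest from `base`."""
    return {
        name: (donor[name] if name in swap else base[name]).copy()
        for name in COMPONENTS
    }


def interpolate(a, b, t):
    """Linearly interpolate two sets of codes, component by component."""
    return {name: (1.0 - t) * a[name] + t * b[name] for name in COMPONENTS}


def to_style_vector(codes):
    """Assumed combination step: concatenate codes in a fixed component order."""
    return np.concatenate([codes[name] for name in COMPONENTS])


rng = np.random.default_rng(0)
person_a = random_codes(rng)
person_b = random_codes(rng)

# Attribute mixing: keep person A's head and pants, borrow B's upper clothes.
mixed = mix(person_a, person_b, swap={"upper_clothes"})

# Continuous control: move halfway from A's outfit toward the mixed outfit.
halfway = interpolate(person_a, mixed, 0.5)
style = to_style_vector(halfway)
```

Because each component has its own code, swapping or interpolating one component leaves the others untouched, which is the property that makes per-attribute control possible in this kind of representation.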
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Human Pose Transfer | DeepFashion In-shop Clothes Retrieval (test) | SSIM 0.772 | 14 |
| Person Image Generation | DeepFashion | FID 18.395 | 11 |
| Pose Transfer | DeepFashion (test) | User Preference Score 73.55 | 9 |
| Person Image Synthesis | DeepFashion 256 x 176 (test) | FID 14.458 | 9 |
| Pose-guided Person Image Generation | DeepFashion 8750 images (test) | FID 16 | 7 |
| Pose Transfer | DeepFashion reduced (test) | FID 20.025 | 7 |
| Garment Transfer | Dance50k (test) | SSIM 76.5 | 4 |
| Garment Transfer | DeepFashion (test) | SSIM 64.3 | 4 |
| Pose Transfer | DeepFashion 256x176 (test) | SSIM 0.772 | 3 |
| Virtual Try-On | DeepFashion (test) | User Preference Score 19.36 | 2 |