
Deep Image Spatial Transformation for Person Image Generation

About

Pose-guided person image generation aims to transform a source person image into a target pose. This task requires spatial manipulation of the source data. However, Convolutional Neural Networks have limited ability to spatially transform their inputs. In this paper, we propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level. Specifically, our model first calculates the global correlations between sources and targets to predict flow fields. Then, the flowed local patch pairs are extracted from the feature maps to calculate the local attention coefficients. Finally, we warp the source features using a content-aware sampling method with the obtained local attention coefficients. The results of both subjective and objective experiments demonstrate the superiority of our model. In addition, results in video animation and view synthesis show that our model is applicable to other tasks requiring spatial transformation. Our source code is available at https://github.com/RenYurui/Global-Flow-Local-Attention.

Yurui Ren, Xiaoming Yu, Junming Chen, Thomas H. Li, Ge Li • 2020
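
The last two steps of the abstract (extracting flowed local patches, then sampling the source features with attention coefficients) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the flow field is already given as integer (dy, dx) offsets, whereas the paper predicts it with a network, and it substitutes a dot-product softmax for the learned local attention coefficients. The function names `softmax` and `local_attention_warp` and the `radius` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_attention_warp(source, target, flow, radius=1):
    """Warp source features toward the target layout.

    source, target: H x W grids of feature vectors (lists of floats).
    flow: H x W grid of integer (dy, dx) offsets into the source
          (assumed given here; the paper predicts it with a network).
    radius: half-width of the local source patch (assumption).
    """
    height, width = len(source), len(source[0])
    channels = len(source[0][0])
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]

    for y in range(height):
        for x in range(width):
            dy, dx = flow[y][x]
            cy, cx = y + dy, x + dx

            # Gather the local source patch around the flowed location,
            # clamping coordinates at the image border.
            patch = []
            for py in range(cy - radius, cy + radius + 1):
                for px in range(cx - radius, cx + radius + 1):
                    qy = min(max(py, 0), height - 1)
                    qx = min(max(px, 0), width - 1)
                    patch.append(source[qy][qx])

            # Stand-in attention: dot-product similarity between the
            # target feature and each patch feature, then softmax.
            query = target[y][x]
            scores = [sum(a * b for a, b in zip(query, feat)) for feat in patch]
            weights = softmax(scores)

            # Content-aware sampling: weighted sum of the patch features.
            for weight, feat in zip(weights, patch):
                for c in range(channels):
                    out[y][x][c] += weight * feat[c]
    return out
```

With `radius=0` the patch holds a single feature, so the function reduces to a plain flow-based warp. Larger radii let each target location blend nearby source features according to how similar they are to the target feature.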

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Human Pose Transfer | DeepFashion In-shop Clothes Retrieval (test) | SSIM 0.79 | 14 |
| Person Image Generation | DeepFashion | FID 14.061 | 11 |
| Person Image Synthesis | DeepFashion 256 x 176 (test) | FID 10.573 | 9 |
| Pose Transfer | DeepFashion (test) | User Preference Score 47.73 | 9 |
| Pose Transfer | DeepFashion 256x256 (test) | FID 10.57 | 7 |
| Human Pose Transfer | DeepFashion (test) | R2G 19.53 | 7 |
| Human Pose Transfer | Market-1501 (test) | SSIM 0.281 | 7 |
| Person Image Synthesis | Market-1501 128 x 64 (test) | FID 19.751 | 5 |
