Learning Semantic Person Image Generation by Region-Adaptive Normalization

About

Human pose transfer has received considerable attention because of its wide range of applications, yet it remains a challenging task that is not well solved. Recent works have had great success in transferring a person image from a source pose to a target pose. However, most of them fail to capture semantic appearance well, which leads to inconsistent and less realistic textures in the generated results. To address this issue, we propose a new two-stage framework that handles pose and appearance translation. In the first stage, we predict the target semantic parsing maps, which eases the difficulty of pose transfer and benefits the subsequent translation of per-region appearance style. In the second stage, given the predicted target semantic maps, we propose a new person image generation method that incorporates region-adaptive normalization, in which per-region styles guide the generation of the target appearance. Extensive experiments show that our proposed SPGNet generates more semantic, consistent, and photo-realistic results and performs favorably against state-of-the-art methods in both quantitative and qualitative evaluation. The source code and model are available at https://github.com/cszy98/SPGNet.git.
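
To make the idea of per-region styles guiding generation more concrete, the following is a minimal NumPy sketch of one plausible form of region-adaptive normalization. It is not taken from the SPGNet code. The function name `region_adaptive_norm`, the linear projections `w_gamma` and `w_beta`, and the choice of instance normalization are all assumptions made for illustration, and the actual layer in the paper may differ in its details.

```python
import numpy as np

def region_adaptive_norm(features, parsing, styles, w_gamma, w_beta, eps=1e-5):
    """Illustrative sketch only (hypothetical names, not the SPGNet API).

    features: (C, H, W) activation map
    parsing:  (H, W) integer semantic labels in [0, R)
    styles:   (R, D) one style vector per semantic region
    w_gamma, w_beta: (D, C) assumed linear projections to scale and shift
    """
    # Parameter-free instance normalization: per channel, over spatial dims.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)

    # Map each region's style vector to per-channel scale and shift.
    gamma = styles @ w_gamma  # (R, C)
    beta = styles @ w_beta    # (R, C)

    # Look up each pixel's parameters through its region label,
    # then move channels first: (H, W, C) -> (C, H, W).
    gamma_map = gamma[parsing].transpose(2, 0, 1)
    beta_map = beta[parsing].transpose(2, 0, 1)

    # Modulate the normalized features region by region.
    return normalized * (1.0 + gamma_map) + beta_map


# Usage with random inputs (all shapes are arbitrary for this example).
rng = np.random.default_rng(0)
C, H, W, R, D = 4, 8, 6, 3, 5
out = region_adaptive_norm(
    rng.normal(size=(C, H, W)),
    rng.integers(0, R, size=(H, W)),
    rng.normal(size=(R, D)),
    rng.normal(size=(D, C)),
    rng.normal(size=(D, C)),
)
print(out.shape)
```

In this sketch every pixel with the same semantic label receives the same scale and shift, so the predicted parsing map from the first stage decides where each region's style is applied.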

Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, Wangmeng Zuo • 2021

Related benchmarks

Task	Dataset	Result
Human Pose Transfer	DeepFashion In-shop Clothes Retrieval (test)	SSIM 0.782	14
Pose-guided person image synthesis	DeepFashion 256 × 176 resolution (test)	FID 16.184	13
Human Pose Transfer	Market-1501 (test)	SSIM 0.315	7
Human Pose Transfer	DeepFashion (test)	R2G 19.47	7
