Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN
About
We present an algorithm for re-rendering a person from a single image under arbitrary poses. Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image. We first learn to inpaint the correspondence field between the body surface texture and the source image with a human body symmetry prior. The inpainted correspondence field allows us to transfer/warp local features extracted from the source to the target view even under large pose changes. Directly mapping the warped local features to an RGB image using a simple CNN decoder often leads to visible artifacts. Thus, we extend the StyleGAN generator so that it takes pose as input (for controlling poses) and introduces a spatially varying modulation for the latent space using the warped local features (for controlling appearances). We show that our method compares favorably against the state-of-the-art algorithms in both quantitative evaluation and visual comparison.
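The two mechanisms the abstract describes, warping source features through a correspondence field and modulating generator features with a spatially varying signal, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names `warp_features` and `spatial_modulate`, the nearest-neighbour sampling, and the normalize-then-scale-and-shift form of the modulation are all assumptions chosen for brevity; the actual method operates inside a StyleGAN generator and may differ in every detail.

```python
import numpy as np

def warp_features(src_feat, corr):
    """Gather source features at the locations given by a correspondence field.

    src_feat: (C, H, W) source feature map.
    corr:     (2, H_t, W_t) field; corr[0] holds the source row and corr[1]
              the source column for each target pixel (hypothetical layout).
    Uses nearest-neighbour sampling; coordinates are clipped to the image.
    """
    _, h, w = src_feat.shape
    rows = np.clip(np.rint(corr[0]).astype(int), 0, h - 1)
    cols = np.clip(np.rint(corr[1]).astype(int), 0, w - 1)
    # Index arrays of shape (H_t, W_t) yield an output of shape (C, H_t, W_t).
    return src_feat[:, rows, cols]

def spatial_modulate(x, scale, shift, eps=1e-8):
    """Normalize x per channel, then apply a per-pixel scale and shift.

    x, scale, shift: arrays of shape (C, H, W). In a full model, scale and
    shift would be predicted from the warped features (assumption).
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps) * (1.0 + scale) + shift

if __name__ == "__main__":
    src = np.random.default_rng(0).normal(size=(8, 16, 16))
    rows, cols = np.meshgrid(np.arange(16), np.arange(16), indexing="ij")
    # A horizontal flip stands in for a pose change in this toy example.
    corr = np.stack([rows, 15 - cols]).astype(float)
    warped = warp_features(src, corr)
    out = spatial_modulate(src, scale=0.1 * warped, shift=warped)
    print(warped.shape, out.shape)
```

Because the scale and shift vary per pixel rather than per channel, local appearance from the source can influence each output location separately, which is the property the abstract attributes to the spatially varying modulation.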
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Virtual Try-On | StreetTryOn Model-to-Model | FID 34.858 | 11 |
| Virtual Try-On | StreetTryOn Model-to-Street | FID 77.274 | 11 |
| Virtual Try-On | StreetTryOn Street-to-Street | FID 84.99 | 11 |
| Virtual Try-On | Model2Street (test) | FID 76.889 | 9 |
| Virtual Try-On | Model2Model (test) | FID 34.224 | 9 |
| Virtual Try-On | Street2Street (test) | FID 84.326 | 9 |