Pose Guided Person Image Generation
About
This paper proposes the novel Pose Guided Person Generation Network (PG$^2$), which synthesizes person images in arbitrary poses based on an image of that person and a novel pose. Our generation framework PG$^2$ utilizes the pose information explicitly and consists of two key stages: pose integration and image refinement. In the first stage, the condition image and the target pose are fed into a U-Net-like network to generate an initial but coarse image of the person in the target pose. The second stage then refines this initial, blurry result by training a U-Net-like generator adversarially. Extensive experimental results on both 128$\times$64 re-identification images and 256$\times$256 fashion photos show that our model generates high-quality person images with convincing details.
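The two-stage pipeline described above can be sketched as follows. This is a minimal, hypothetical NumPy outline, not the paper's implementation: `g1_coarse` and `g2_refine` are placeholder stubs standing in for the two U-Net-like generators, and the shapes and keypoint count are assumptions for the 128$\times$64 re-identification setting.

```python
import numpy as np

# Assumed dimensions: 128x64 RGB re-id images, pose given as one
# heatmap per body keypoint (keypoint count is an assumption here).
H, W, C = 128, 64, 3
N_KEYPOINTS = 18

def g1_coarse(condition, pose_heatmaps):
    """Stub for the stage-1 generator: the real model is a U-Net-like
    network that consumes the condition image concatenated with the
    target-pose heatmaps and outputs a coarse image in the target pose."""
    x = np.concatenate([condition, pose_heatmaps], axis=-1)  # (H, W, C + K)
    return x[..., :C] * 0.5  # placeholder "generation"

def g2_refine(condition, coarse):
    """Stub for the stage-2 generator: the real model is a U-Net-like
    generator trained adversarially that predicts a difference map,
    so refinement only has to add missing detail to the coarse result."""
    diff_map = (condition - coarse) * 0.1  # placeholder difference map
    return coarse + diff_map

# End-to-end pass with random stand-in inputs.
condition = np.random.rand(H, W, C).astype(np.float32)
pose = np.random.rand(H, W, N_KEYPOINTS).astype(np.float32)
coarse = g1_coarse(condition, pose)
refined = g2_refine(condition, coarse)
```

The point of the residual (difference-map) formulation in stage 2 is that the generator starts from an already plausible coarse image and only corrects it, which is easier to train adversarially than generating from scratch.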
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Person Image Generation | Market-1501 (test) | SSIM | 0.261 | 25 |
| Person Image Generation | DeepFashion (test) | SSIM | 0.773 | 19 |
| Pose-guided Human Image Generation | Market-1501 | R2G Score | 11.2 | 13 |
| Hand gesture-to-gesture translation | Senz3D (test) | FID | 31.7333 | 11 |
| Person Image Generation | DeepFashion | -- | -- | 11 |
| Person Image Synthesis | DeepFashion (test) | SSIM | 0.773 | 10 |
| Pose Transfer | DeepFashion (test) | User Preference Score | 1.61 | 9 |
| Face Reenactment | same source | AU (%) | 82.7 | 7 |
| Face Reenactment | cross source | AU (%) | 82.6 | 7 |
| Face Reenactment | in the wild | AU (%) | 82.3 | 7 |
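Several of the benchmark entries above report SSIM (structural similarity), which compares two images via their luminance, contrast, and structure statistics. The sketch below computes a simplified *global* SSIM over the whole image in plain NumPy; note that benchmark SSIM scores are normally computed with local sliding windows and averaged, so this single-window version is an illustration of the formula, not the evaluation code used for the table.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two images of equal shape.

    Uses the standard stabilizing constants c1 = (0.01 * L)^2 and
    c2 = (0.03 * L)^2, where L is the dynamic range of the pixel values.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

img = np.random.rand(32, 32)
same = global_ssim(img, img)        # identical images -> 1.0
noisy = global_ssim(img, np.clip(img + 0.2 * np.random.rand(32, 32), 0, 1))
```

Identical images score exactly 1.0; a corrupted copy scores lower, which is why higher SSIM in the table indicates generated images closer to the ground-truth targets.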