Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting
About
High-quality image inpainting requires filling missing regions in a damaged image with plausible content. Existing works either fill the regions by copying image patches or by generating semantically coherent patches from the region context, while neglecting the fact that both visual and semantic plausibility are in high demand. In this paper, we propose a Pyramid-context ENcoder Network (PEN-Net) for image inpainting by deep generative models. The PEN-Net is built upon a U-Net structure, which can restore an image by encoding contextual semantics from the full-resolution input and decoding the learned semantic features back into images. Specifically, we propose a pyramid-context encoder, which progressively learns region affinity by attention from a high-level semantic feature map and transfers the learned attention to the previous low-level feature map. As the missing content can be filled by attention transfer from deep to shallow in a pyramid fashion, both visual and semantic coherence for image inpainting can be ensured. We further propose a multi-scale decoder with deeply-supervised pyramid losses and an adversarial loss. Such a design results not only in fast convergence in training, but also in more realistic results in testing. Extensive experiments on various datasets show the superior performance of the proposed network.
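The attention-transfer idea described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes 1x1 patches on the high-level map, cosine similarity followed by a softmax, and a low-level map at exactly twice the resolution. The function name `attention_transfer` and all shapes are hypothetical choices made for illustration only.

```python
import numpy as np

def attention_transfer(high, low, mask):
    """Simplified attention transfer (illustrative assumption, not PEN-Net's code).

    high: (C, H, W) high-level feature map
    low:  (C2, 2H, 2W) low-level feature map at twice the resolution
    mask: (H, W) bool array, True where the region is missing
    """
    C, H, W = high.shape
    C2 = low.shape[0]

    # Learn region affinity on the high-level map: cosine similarity
    # between every missing location and every known location.
    feats = high.reshape(C, H * W).T
    feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    flat = mask.reshape(-1)
    missing, known = np.flatnonzero(flat), np.flatnonzero(~flat)
    sim = feats[missing] @ feats[known].T

    # Softmax over known locations (shifted for numerical stability).
    attn = np.exp(sim - sim.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)

    # Split the low-level map into the 2x2 patches that sit under each
    # high-level location, giving one row per location.
    patches = (low.reshape(C2, H, 2, W, 2)
                  .transpose(1, 3, 0, 2, 4)
                  .reshape(H * W, C2 * 4))

    # Transfer: fill each missing low-level patch with the
    # attention-weighted sum of the known low-level patches.
    filled = patches.copy()
    filled[missing] = attn @ patches[known]

    out = (filled.reshape(H, W, C2, 2, 2)
                 .transpose(2, 0, 3, 1, 4)
                 .reshape(C2, 2 * H, 2 * W))
    return out, attn

# Toy usage with random features and a square hole in the middle.
rng = np.random.default_rng(0)
high = rng.normal(size=(8, 4, 4))
low = rng.normal(size=(3, 8, 8))
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
out, attn = attention_transfer(high, low, mask)
```

In this sketch the attention weights are computed once on the deeper map and reused on the shallower one, which is the "deep to shallow" direction the abstract describes. A full pyramid would repeat the step level by level.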
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Inpainting | Places2 (test) | PSNR | 23.19 | 68 |
| Image Inpainting | Places2 irregular mask (val) | PSNR | 26.78 | 48 |
| Image Inpainting | Places2 (evaluation) | L1 Error | 0.69 | 42 |
| Image Inpainting | Places2 (0.5, 0.6] (test) | PSNR | 18.29 | 9 |
| Image Inpainting | Places2 (0.01, 0.1] (test) | PSNR | 31.61 | 9 |
| Image Inpainting | Places2 (0.1, 0.2] (test) | PSNR | 25.76 | 9 |
| Image Inpainting | Places2 (0.2, 0.3] (test) | PSNR | 23.04 | 9 |
| Image Inpainting | Places2 (0.3, 0.4] (test) | PSNR | 21.07 | 9 |
| Image Inpainting | Places2 (0.4, 0.5] (test) | PSNR | 19.39 | 9 |
| Wide-Range Image Blending | Scenery dataset (test) | FID | 159.7 | 8 |