
High-Fidelity Pluralistic Image Completion with Transformers

About

Image completion has made tremendous progress with convolutional neural networks (CNNs), thanks to their powerful texture modeling capacity. However, due to some inherent properties (e.g., local inductive prior, spatial-invariant kernels), CNNs do not perform well in understanding global structures, nor do they naturally support pluralistic completion. Recently, transformers have demonstrated their power in modeling long-range relationships and generating diverse results, but their computational complexity is quadratic in the input length, which hampers their application to high-resolution images. This paper brings the best of both worlds to pluralistic image completion: appearance prior reconstruction with a transformer and texture replenishment with a CNN. The transformer recovers pluralistic coherent structures together with some coarse textures, while the CNN enhances the local texture details of the coarse priors, guided by the high-resolution masked images. The proposed method vastly outperforms state-of-the-art methods in three respects: 1) a large performance boost in image fidelity, even compared to deterministic completion methods; 2) better diversity and higher fidelity for pluralistic completion; 3) exceptional generalization ability on large masks and generic datasets such as ImageNet.
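To make the quadratic-cost point concrete, here is a minimal sketch of how the number of pairwise token interactions in one full self-attention layer grows with image resolution. It assumes one token per pixel, and the two resolutions are hypothetical values chosen for illustration; neither is taken from the paper.

```python
def attention_pairs(height: int, width: int) -> int:
    """Pairwise token interactions in one full self-attention layer.

    Assumes one token per pixel (an illustrative assumption, not the
    paper's tokenization).
    """
    tokens = height * width
    return tokens * tokens


# Hypothetical resolutions, for illustration only.
low_res = attention_pairs(32, 32)
high_res = attention_pairs(256, 256)

print(low_res)              # 1048576
print(high_res)             # 4294967296
print(high_res // low_res)  # 4096
```

Under these assumptions, an 8x increase in side length multiplies the attention cost by 4096. That is consistent with the division of labor in the abstract: the transformer recovers structure and coarse textures, and the CNN adds detail guided by the high-resolution masked image.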

Ziyu Wan, Jingbo Zhang, Dongdong Chen, Jing Liao • 2021

Related benchmarks

Task | Dataset | Result | Rank
Image Inpainting | Places2 (test) | PSNR 22.12 | 68
Inpainting | ImageNet | LPIPS 0.073 | 54
Image Inpainting | CelebA-HQ | LPIPS 0.036 | 42
Image Inpainting | FFHQ (test) | FID 10.442 | 40
Inpainting | CelebA-HQ | LPIPS 0.036 | 36
Image Inpainting | CelebA-HQ 256x256 (test) | FID 5.24 | 19
Image Inpainting | CelebA-HQ 512x512 (test) | LPIPS 0.105 | 16
Image Inpainting | CelebA with irregular mask 40-60% mask ratio | PSNR 21.84 | 8
Image Inpainting | CelebA with irregular mask 0-20% mask ratio | PSNR 33.27 | 8
Image Inpainting | CelebA with irregular mask 20-40% mask ratio | PSNR 26.4 | 8
Showing 10 of 15 rows
