Thinking inside the Convolution for Image Inpainting: Reconstructing Texture via Structure under Global and Local Side

About

Image inpainting has made substantial progress owing to the encoder-decoder pipeline. The encoder relies on Convolutional Neural Networks (CNNs) with convolutional downsampling to inpaint the masked regions semantically from the known regions, and the decoder upsamples the result into the final inpainting output. Recent studies identify the high-frequency structure and low-frequency texture that CNNs extract in the encoder and use them to guide the subsequent upsampling recovery. However, existing methods overlook the information loss that both the structure and texture feature maps suffer during convolutional downsampling, and hence produce a non-ideal upsampling output. In this paper, we systematically answer whether and how the structure and texture feature maps can mutually help to alleviate the information loss during convolutional downsampling. Given the structure and texture feature maps, we adopt a statistical normalization and denormalization strategy as reconstruction guidance during the convolutional downsampling process. Extensive experimental results validate its advantages over state-of-the-art methods on images from low to high resolutions, including 256×256 and 512×512, especially when all of their encoders are substituted by ours. Our code is available at https://github.com/htyjers/ConvInpaint-TSGL
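The abstract does not spell out the normalization and denormalization step, so the following is only a minimal sketch of the general idea, assuming an AdaIN-style scheme: one feature map is normalized with its own per-channel statistics and then denormalized with the statistics of the other. The function names `channel_stats` and `renormalize`, the `(C, H, W)` layout, and the choice of mean and standard deviation as the statistics are all assumptions made for illustration; the authors' actual formulation is in the linked repository.

```python
import numpy as np


def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std over the spatial axes of a (C, H, W) map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std


def renormalize(source, guide, eps=1e-5):
    """Normalize `source` with its own statistics, then denormalize it
    with the statistics of `guide` (AdaIN-style; an assumption, not the
    paper's confirmed formulation)."""
    s_mean, s_std = channel_stats(source, eps)
    g_mean, g_std = channel_stats(guide, eps)
    return (source - s_mean) / s_std * g_std + g_mean


# Hypothetical feature maps standing in for encoder outputs.
rng = np.random.default_rng(0)
texture = rng.normal(0.0, 1.0, size=(4, 8, 8))
structure = rng.normal(2.0, 3.0, size=(4, 8, 8))

# Texture features re-expressed under the structure features' statistics.
guided = renormalize(texture, structure)
```

After the call, `guided` keeps the spatial pattern of `texture` while its per-channel mean and standard deviation match those of `structure`. Swapping the two arguments gives the opposite direction of guidance.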

Haipeng Liu, Yang Wang, Biao Qian, Yong Rui, Meng Wang • 2026

Related benchmarks

Task	Dataset	Result
Inpainting	Places2 Wide Mask 512x512 (test)	FID 1.65	30
Inpainting	Places2 Medium Mask 512x512 (test)	FID 1.26	15
Inpainting	Places2 Medium Mask 512 x 512	FID 1.26	15
Inpainting	Places2 512x512 Narrow Mask (test)	FID 0.58	15
Inpainting	Places2 Narrow Mask 512 x 512	FID 0.58	15
