Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis

About

Any-scale image synthesis offers an efficient and scalable solution to synthesize photo-realistic images at any scale, even going beyond 2K resolution. However, existing GAN-based solutions depend excessively on convolutions and a hierarchical architecture, which introduce inconsistency and the "texture sticking" issue when scaling the output resolution. From another perspective, INR-based generators are scale-equivariant by design, but their huge memory footprint and slow inference hinder these networks from being adopted in large-scale or real-time systems. In this work, we propose Column-Row Entangled Pixel Synthesis (CREPS), a new generative model that is both efficient and scale-equivariant without using any spatial convolutions or coarse-to-fine design. To reduce the memory footprint and make the system scalable, we employ a novel bi-line representation that decomposes layer-wise feature maps into separate "thick" column and row encodings. Experiments on various datasets, including FFHQ, LSUN-Church, MetFaces, and Flickr-Scenery, confirm CREPS's ability to synthesize scale-consistent and alias-free images at any resolution with proper training and inference speed. Code is available at https://github.com/VinAIResearch/CREPS.
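
To make the memory argument behind the bi-line representation concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: it assumes that each pixel of a feature-map channel is the scaled dot product of a D-dimensional row vector and a D-dimensional column vector, where D stands for the "thickness" of the encodings. The function names, the 1/sqrt(D) scaling, and the example sizes (512 channels, thickness 8) are all hypothetical choices made for illustration.

```python
def compose_channel(row_enc, col_enc):
    """Compose one H x W feature-map channel from "thick" encodings.

    row_enc: H lists of D floats (one D-vector per image row).
    col_enc: W lists of D floats (one D-vector per image column).
    Assumption (not confirmed by the abstract): pixel (y, x) is the dot
    product of row vector y and column vector x, scaled by 1/sqrt(D).
    """
    thickness = len(row_enc[0])
    scale = thickness ** -0.5
    return [
        [scale * sum(r * c for r, c in zip(row_vec, col_vec))
         for col_vec in col_enc]
        for row_vec in row_enc
    ]


def stored_values(height, width, channels, thickness):
    """Return (dense, bi_line): activations stored per layer in each layout."""
    dense = channels * height * width
    bi_line = channels * thickness * (height + width)
    return dense, bi_line


# Toy encodings: H = 3 rows, W = 2 columns, thickness D = 2.
row_enc = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
col_enc = [[2.0, 0.0], [0.0, 2.0]]
channel = compose_channel(row_enc, col_enc)

# Hypothetical layer at 1024 x 1024 with 512 channels and thickness 8.
dense, bi_line = stored_values(1024, 1024, 512, 8)
print(len(channel), len(channel[0]))
print(dense // bi_line)
```

Under these assumptions, the encodings grow with H + W rather than H × W, so the gap to a dense feature map widens as the output resolution increases.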

Thuan Hoang Nguyen, Thanh Van Le, Anh Tran • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Unconditional Image Generation | LSUN Church 256x256 | FID 5.5 | 14 |
| Unconditional image synthesis | FFHQ 1024 | FID 4.09 | 12 |
| Image Synthesis | FFHQ 1024 (test) | FID (50k) 4.09 | 9 |
| Image Synthesis | LSUN Church 256x256 (test) | FID 5.5 | 6 |
| Image Synthesis | FFHQ 512 (test) | FID 4.43 | 3 |
| Unconditional image synthesis | FFHQ 512 | FID 4.43 | 3 |
| Unconditional image synthesis | Scenery-256 | FID 7.21 | 3 |
| Unconditional image synthesis | MetFaces 1024 | FID 20.52 | 2 |
