
Pivotal Tuning for Latent-based Editing of Real Images

About

Recently, a surge of advanced facial editing techniques has been proposed that leverage the generative power of a pre-trained StyleGAN. To successfully edit an image this way, one must first project (or invert) the image into the pre-trained generator's domain. As it turns out, however, StyleGAN's latent space induces an inherent tradeoff between distortion and editability, i.e., between maintaining the original appearance and convincingly altering some of its attributes. Practically, this means it is still challenging to apply ID-preserving facial latent-space editing to faces that are out of the generator's domain. In this paper, we present an approach to bridge this gap. Our technique slightly alters the generator, so that an out-of-domain image is faithfully mapped into an in-domain latent code. The key idea is pivotal tuning: a brief training process that preserves the editing quality of an in-domain latent region, while changing its portrayed identity and appearance. In Pivotal Tuning Inversion (PTI), an initial inverted latent code serves as a pivot, around which the generator is fine-tuned. At the same time, a regularization term keeps nearby identities intact, to locally contain the effect. This surgical training process ends up altering appearance features that represent mostly identity, without affecting editing capabilities. We validate our technique through inversion and editing metrics, and show preferable scores to state-of-the-art methods. We further qualitatively demonstrate our technique by applying advanced edits (such as pose, age, or expression) to numerous images of well-known and recognizable identities. Finally, we demonstrate resilience to harder cases, including heavy make-up, elaborate hairstyles and/or headwear, which otherwise could not have been successfully inverted and edited by state-of-the-art methods.
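The two-phase recipe in the abstract (invert to a pivot with a frozen generator, then fine-tune the generator around that pivot under a locality regularizer) maps naturally to code. Below is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the generator `G` (mapping a W-space code to an image), the latent sampler `sample_w`, the perceptual loss `lpips_fn` (e.g., from the `lpips` package), and all step counts, learning rates, and weights (`lambda_reg`, `alpha`) are illustrative placeholders. The loss structure, a perceptual term plus pixel-wise L2 at the pivot with an analogous drift penalty on latent codes interpolated toward the pivot, follows the process described above.

```python
import copy

import torch
import torch.nn.functional as F


def invert_to_pivot(G, w_init, target, lpips_fn, steps=450, lr=5e-3):
    """Phase 1: with G frozen, optimize a latent code w_p (the pivot)
    so that G(w_p) reconstructs the target image."""
    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = G(w)
        loss = lpips_fn(img, target).mean() + F.mse_loss(img, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()


def pivotal_tuning(G, w_pivot, target, lpips_fn, sample_w,
                   steps=350, lr=3e-4, lambda_reg=0.1, alpha=30.0):
    """Phase 2: fine-tune G so that G(w_pivot) matches the target,
    while a locality regularizer keeps G's output on latent codes
    near the pivot close to that of the frozen original generator."""
    G_orig = copy.deepcopy(G).eval()
    for p in G_orig.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(steps):
        # Reconstruction terms at the pivot.
        img = G(w_pivot)
        loss = lpips_fn(img, target).mean() + F.mse_loss(img, target)

        # Locality regularization (illustrative form): draw a random
        # latent code, step a distance alpha from the pivot toward it,
        # and penalize drift between the tuned and frozen generators.
        w_z = sample_w()
        direction = (w_z - w_pivot) / (w_z - w_pivot).norm()
        w_r = w_pivot + alpha * direction
        img_r = G(w_r)
        with torch.no_grad():
            img_r_orig = G_orig(w_r)
        drift = (lpips_fn(img_r, img_r_orig).mean()
                 + F.mse_loss(img_r, img_r_orig))
        loss = loss + lambda_reg * drift

        opt.zero_grad()
        loss.backward()
        opt.step()
    return G
```

Note that in this sketch only the generator's weights change in phase 2; the pivot code itself stays fixed, which is what keeps the surrounding latent region, and hence its editing directions, usable after tuning.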

Daniel Roich, Ron Mokady, Amit H. Bermano, Daniel Cohen-Or • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| 3D Reconstruction | AnimeRecon 1.0 (test) | Front CLIP Score | 89.9 | 9 |
| Face Reconstruction | CelebA-HQ | MSE | 0.0084 | 8 |
| Reconstruction | OOD videos | LPIPS | 0.3144 | 8 |
| Reconstruction | OOD videos Images | LPIPS | 0.3192 | 8 |
| Identity Preservation | Face Images OOD | Accuracy (eyeglasses) | 91.14 | 8 |
| Identity Preservation | OOD Face Videos | Eyeglasses Consistency | 90.49 | 8 |
| Novel View Synthesis | CelebA-HQ | ID Similarity | 65.7 | 7 |
| 3D GAN Inversion | FFHQ + LPFF (test) | L2 Loss | 0.019 | 7 |
| 3D GAN Inversion | CelebAHQ (test) | L2 Error | 0.033 | 7 |
| 3D GAN Inversion | MEAD (novel views) | LPIPS (±60°) | 0.346 | 7 |

Showing 10 of 13 rows.
