AvatarTex: High-Fidelity Facial Texture Reconstruction from Single-Image Stylized Avatars

About

We present AvatarTex, a high-fidelity facial texture reconstruction framework capable of generating both stylized and photorealistic textures from a single image. Existing methods struggle with stylized avatars due to the lack of diverse multi-style datasets and challenges in maintaining geometric consistency in non-standard textures. To address these limitations, AvatarTex introduces a novel three-stage diffusion-to-GAN pipeline. Our key insight is that while diffusion models excel at generating diversified textures, they lack explicit UV constraints, whereas GANs provide a well-structured latent space that ensures style and topology consistency. By integrating these strengths, AvatarTex achieves high-quality topology-aligned texture synthesis with both artistic and geometric coherence. Specifically, our three-stage pipeline first completes missing texture regions via diffusion-based inpainting, refines style and structure consistency using GAN-based latent optimization, and enhances fine details through diffusion-based repainting. To address the need for a stylized texture dataset, we introduce TexHub, a high-resolution collection of 20,000 multi-style UV textures with precise UV-aligned layouts. By leveraging TexHub and our structured diffusion-to-GAN pipeline, AvatarTex establishes a new state-of-the-art in multi-style facial texture reconstruction. TexHub will be released upon publication to facilitate future research in this field.

Yuda Qiu, Zitong Xiao, Yiwei Zuo, Zisheng Ye, Weikai Chen, Xiaoguang Han • 2025

Related benchmarks

Task	Dataset	Result
Image Reconstruction	FFHQ (test)	--	36
Facial Texture Reconstruction	FFHQ 1,000 images (test)	PSNR 30.03	4
Facial Texture Reconstruction	LPFF 1,000 images (test)	PSNR 27.91	4
Facial Texture Reconstruction	CANVAS 500 samples (test)	PSNR 23.93	4
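
The results above are reported as PSNR (peak signal-to-noise ratio, in dB, higher is better). As a reading aid, here is a minimal sketch of the standard PSNR definition for 8-bit images. The function name `psnr`, the flattened-list input format, and the toy pixel values are assumptions made for illustration; the listing does not say which image region, color space, or resolution the authors evaluate on, so this is not their evaluation code.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Standard definition: PSNR = 10 * log10(max_value^2 / MSE).
    """
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixel values")
    if not reference:
        raise ValueError("inputs must be non-empty")

    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy flattened 8-bit "images" (made-up values, not data from the paper).
reference = [52, 55, 61, 59, 79, 61, 76, 61]
reconstruction = [54, 55, 60, 62, 77, 61, 75, 63]
print(f"{psnr(reference, reconstruction):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gap of about 3 dB between two rows corresponds to roughly a factor of two in MSE.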

