Text2Tex: Text-driven Texture Synthesis via Diffusion Models

About

We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high-resolution partial textures from multiple viewpoints. To avoid accumulating inconsistent and stretched artifacts across views, we dynamically segment the rendered view into a generation mask, which represents the generation status of each visible texel. This partitioned view representation guides the depth-aware inpainting model to generate and update partial textures for the corresponding regions. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. Extensive experiments demonstrate that our method significantly outperforms existing text-driven approaches and GAN-based methods.
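The generation mask and next-best-view ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the four status labels, the cosine-based comparison between the current view and the best earlier view, and the "most pixels needing work" selection heuristic are all assumptions made for illustration, as are the function names `partition_view` and `pick_next_view`.

```python
import numpy as np

# Hypothetical status labels for each pixel of a rendered view (assumed names).
IGNORE, NEW, UPDATE, KEEP = 0, 1, 2, 3


def partition_view(visible, painted, view_cos, best_cos):
    """Label each pixel of a rendered view with an assumed generation status.

    visible  : bool array, True where the mesh is visible in this view
    painted  : bool array, True where the texel already has a texture
    view_cos : float array, how directly this view faces the surface
    best_cos : float array, best such value over earlier views
    """
    visible = np.asarray(visible, dtype=bool)
    painted = np.asarray(painted, dtype=bool)
    view_cos = np.asarray(view_cos, dtype=float)
    best_cos = np.asarray(best_cos, dtype=float)

    mask = np.full(visible.shape, IGNORE, dtype=np.uint8)
    # Visible but never textured: generate from scratch.
    mask[visible & ~painted] = NEW
    # Already textured, but this view sees the surface more directly: refine.
    better = view_cos > best_cos
    mask[visible & painted & better] = UPDATE
    # Already textured from an equal or better angle: leave unchanged.
    mask[visible & painted & ~better] = KEEP
    return mask


def pick_next_view(masks):
    """Assumed heuristic: choose the candidate view with the most NEW or UPDATE pixels."""
    scores = [int(np.count_nonzero((m == NEW) | (m == UPDATE))) for m in masks]
    return int(np.argmax(scores))


visible = [[True, True], [True, False]]
painted = [[False, True], [True, False]]
view_cos = [[0.9, 0.9], [0.2, 0.5]]
best_cos = [[0.0, 0.5], [0.8, 0.0]]
mask = partition_view(visible, painted, view_cos, best_cos)
```

In this toy 2x2 view, the first pixel is untextured, the second is textured but seen more directly now, the third was seen better before, and the fourth is not visible. A depth-aware inpainting model would then be asked to fill only the regions flagged for generation or refinement.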

Dave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner • 2023

Related benchmarks

Task	Dataset	Result
Text-guided visual synthesis	Objaverse	FID 41.62	14
Texture Synthesis	3D-Front (test)	CLIP Score 20.83	7
Text-to-PBR-Texture Generation	Objaverse (test)	Shaded Image FID-CLIP 4.533	6
Texture Map Synthesis	Text-conditioned texture map synthesis (test)	CMMD 2.811	6
Garment Texture Synthesis	Custom Garment Texture Synthesis Dataset	FID 77.28	6
Text-to-texture synthesis	Objaverse subset	FID 37.89	5
Text-to-Texture Generation	3D biped cartoon dataset (test)	Quality Score 2.72	5
3D Scene Editing	3D-Front (test)	CS 27.8	5
Text-guided 3D Shape Texturing	Prompt Set (test)	CLIP Similarity 28.81	5

