TEXTure: Text-Guided Texturing of 3D Shapes
About
In this paper, we present TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes. Leveraging a pretrained depth-to-image diffusion model, TEXTure applies an iterative scheme that paints a 3D model from different viewpoints. Yet, while depth-to-image models can create plausible textures from a single viewpoint, the stochastic nature of the generation process can cause many inconsistencies when texturing an entire 3D object. To tackle these problems, we dynamically define a trimap partitioning of the rendered image into three progression states, and present a novel elaborated diffusion sampling process that uses this trimap representation to generate seamless textures from different views. We then show that one can transfer the generated texture maps to new 3D geometries without requiring explicit surface-to-surface mapping, as well as extract semantic textures from a set of images without requiring any explicit reconstruction. Finally, we show that TEXTure can be used to not only generate new textures but also edit and refine existing textures using either a text prompt or user-provided scribbles. We demonstrate that our TEXTuring method excels at generating, transferring, and editing textures through extensive evaluation, and further close the gap between 2D image generation and 3D texturing.
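The trimap idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the three progression states or say how pixels are assigned to them, so everything here is an assumption for illustration: the state labels (`KEEP`, `REFINE`, `GENERATE`), the per-pixel "view quality" score, the `margin` threshold, and the `build_trimap` helper are hypothetical and are not taken from the authors' implementation.

```python
from enum import Enum


class State(Enum):
    # Hypothetical labels for the three progression states.
    KEEP = "keep"          # already painted well; leave untouched
    REFINE = "refine"      # painted before, but this view sees it better
    GENERATE = "generate"  # never painted; must be synthesized


def build_trimap(painted, stored_quality, current_quality, margin=0.1):
    """Assign one State per pixel of a rendered view.

    painted:         rows of bools, True where an earlier view painted the texture
    stored_quality:  rows of floats, view quality recorded when the pixel was painted
    current_quality: rows of floats, view quality from the current camera
    margin:          assumed threshold by which the current view must be better
    """
    trimap = []
    for p_row, s_row, c_row in zip(painted, stored_quality, current_quality):
        row = []
        for was_painted, stored, current in zip(p_row, s_row, c_row):
            if not was_painted:
                row.append(State.GENERATE)
            elif current > stored + margin:
                row.append(State.REFINE)
            else:
                row.append(State.KEEP)
        trimap.append(row)
    return trimap


# Toy 2x2 view: one unpainted pixel, one pixel now seen from a better angle.
painted = [[False, True], [True, True]]
stored = [[0.0, 0.9], [0.2, 0.5]]
current = [[0.8, 0.5], [0.9, 0.55]]
trimap = build_trimap(painted, stored, current)
```

In a pipeline of this kind, such a per-pixel labeling would then tell the diffusion sampler which regions to leave fixed, which to adjust lightly, and which to synthesize from scratch. How TEXTure actually uses the trimap during sampling is described in the paper itself, not in this sketch.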
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Texture Reconstruction | THuman 2.0 (test) | PSNR: 16.7869 | 8 |
| 3D Human Generation | User Study 30 prompts | Q1 Best Preference Rate: 3.64 | 8 |
| Text-to-3D Human Generation | 30 prompt set Stable Diffusion V1.5 1.0 (test) | FID: 142.8 | 7 |
| 3D Texture Editing | 24 3D meshes (test) | CLIP Score: 31.22 | 5 |
| Text-guided 3D Shape Texturing | Prompt Set (test) | CLIP Similarity: 29.25 | 5 |
| Text-to-texture synthesis | Objaverse subset | FID: 38.4 | 5 |
| Text-to-Texture Generation | 3D biped cartoon dataset (test) | Quality Score: 2.67 | 5 |
| Text-to-3D Human Generation | T3H Evaluation (test) | Generation Time (sec): 103.7 | 4 |
| Image-conditioned texture generation | User Study | Image Alignment: 1.92 | 4 |
| PBR Texture Generation | PBR Texture Generation Relighted (test) | PSNR: 16.21 | 4 |