Towards Implicit Text-Guided 3D Shape Generation

About

In this work, we explore the challenging task of generating 3D shapes from text. Going beyond existing work, we propose a new approach for text-guided 3D shape generation that produces high-fidelity shapes with colors that match the given text description. This work makes several technical contributions. First, we decouple the shape and color predictions for learning features in both texts and shapes, and propose a word-level spatial transformer to correlate word features from the text with spatial features from the shape. Second, we design a cyclic loss to encourage consistency between text and shape, and introduce the shape IMLE to diversify the generated shapes. Third, we extend the framework to enable text-guided shape manipulation. Extensive experiments on the largest existing text-shape benchmark demonstrate the superiority of this approach. The code and the models are available at https://github.com/liuzhengzhe/Towards-Implicit-Text-Guided-Shape-Generation.
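The abstract names IMLE (Implicit Maximum Likelihood Estimation) as the mechanism for diversifying generated shapes. The general idea of IMLE can be sketched as follows: draw several noise vectors for one text input, generate a candidate for each, and keep only the candidate closest to the ground-truth shape for the training loss. This is a minimal sketch of generic IMLE, not the authors' implementation; `toy_generator`, `imle_select`, the linear decoder, and the plain vector "shapes" are hypothetical placeholders chosen only to keep the example self-contained.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_generator(text_feature, noise, weight=0.5):
    """Hypothetical stand-in for a shape decoder: text feature plus scaled noise."""
    return [t + weight * z for t, z in zip(text_feature, noise)]


def imle_select(text_feature, target_shape, num_samples=8, rng=None):
    """Sample several noise vectors and keep the one whose generated
    candidate is nearest to the target (the core IMLE selection step).

    Returns (best_loss, best_noise). In training, only best_loss would be
    back-propagated, so every target is covered by at least one sample.
    """
    rng = rng or random.Random(0)
    best_loss, best_noise = float("inf"), None
    for _ in range(num_samples):
        noise = [rng.gauss(0.0, 1.0) for _ in text_feature]
        candidate = toy_generator(text_feature, noise)
        loss = squared_distance(candidate, target_shape)
        if loss < best_loss:
            best_loss, best_noise = loss, noise
    return best_loss, best_noise


if __name__ == "__main__":
    text_feature = [0.2, -0.1, 0.4]   # assumed text embedding (illustrative)
    target_shape = [0.5, 0.0, 0.1]    # assumed ground-truth shape code (illustrative)
    loss, noise = imle_select(text_feature, target_shape)
    print("selected loss:", loss)
```

Because the loss is computed only on the nearest candidate, the remaining noise vectors are free to map to other plausible shapes, which is the property that makes IMLE useful for diversity.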

Zhengzhe Liu, Yi Wang, Xiaojuan Qi, Chi-Wing Fu • 2022

Related benchmarks

Task	Dataset	Metric	Result
Text-conditioned 3D shape generation	Text2Shape (original)	CLIP-S	38.88	4
Recursive Text-conditioned 3D Shape Generation	ShapeGlot [1, 2] phrases	CLIP-S Score	27.2	4
Text-to-Shape Generation	Text2Shape	Accuracy	34.79	4
3D Generation	ModelNet40 Table (test)	FPD	1.64e+3	3
3D Generation	ModelNet40 Chair (test)	FPD	1.57e+3	3
Text-to-Shape Generation	Text2Shape 5 (test)	IoU	12.21	3
Text-guided 3D Shape Generation	ShapeNet	IoU	12.21	2
Recursive Text-conditioned 3D Shape Generation	ShapeGlot (2, 4] phrases	CLIP-S	42.32	2
Recursive Text-conditioned 3D Shape Generation	ShapeGlot (4, +∞) phrases	CLIP-S	42.84	2

