
T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations

About

In this work, we investigate a simple and well-established conditional generative framework based on the Vector Quantised-Variational AutoEncoder (VQ-VAE) and the Generative Pre-trained Transformer (GPT) for human motion generation from textual descriptions. We show that a simple CNN-based VQ-VAE with commonly used training recipes (EMA and code reset) yields high-quality discrete representations. For the GPT, we incorporate a simple corruption strategy during training to alleviate the training-testing discrepancy. Despite its simplicity, our T2M-GPT outperforms competitive approaches, including recent diffusion-based methods. For example, on HumanML3D, currently the largest dataset, we achieve comparable performance on the consistency between text and generated motion (R-Precision), while our FID of 0.116 largely outperforms MotionDiffuse's 0.630. Additionally, our analyses on HumanML3D indicate that dataset size is a limitation of our approach. Our work suggests that VQ-VAE remains a competitive approach for human motion generation.

Jianrong Zhang, Yangsong Zhang, Xiaodong Cun, Shaoli Huang, Yong Zhang, Hongwei Zhao, Hongtao Lu, Xi Shen · 2023
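
The abstract names two codebook training recipes, exponential moving average (EMA) updates and code reset, plus a corruption strategy for GPT training. Below is a minimal PyTorch sketch of how such a quantizer is commonly implemented; the class name, hyperparameters, and reset rule are illustrative assumptions, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

class EMAQuantizer(torch.nn.Module):
    """Vector quantizer with EMA codebook updates and dead-code reset.

    A hypothetical sketch of the two recipes named in the abstract
    (EMA and code reset), not the paper's exact implementation.
    """

    def __init__(self, num_codes=512, dim=512, decay=0.99, reset_threshold=1.0):
        super().__init__()
        self.decay = decay
        self.reset_threshold = reset_threshold
        self.register_buffer("codebook", torch.randn(num_codes, dim))
        self.register_buffer("ema_count", torch.ones(num_codes))
        self.register_buffer("ema_sum", self.codebook.clone())

    def forward(self, z):  # z: (N, dim) encoder outputs
        # Assign each latent vector to its nearest codebook entry.
        idx = torch.cdist(z, self.codebook).argmin(dim=1)
        z_q = self.codebook[idx]

        if self.training:
            with torch.no_grad():
                one_hot = F.one_hot(idx, self.codebook.shape[0]).float()
                # EMA update of per-code usage counts and assigned-latent sums.
                self.ema_count.mul_(self.decay).add_(one_hot.sum(0), alpha=1 - self.decay)
                self.ema_sum.mul_(self.decay).add_(one_hot.t() @ z, alpha=1 - self.decay)
                self.codebook.copy_(self.ema_sum / self.ema_count.clamp(min=1e-5).unsqueeze(1))
                # Code reset: re-initialise rarely used codes from random latents.
                dead = self.ema_count < self.reset_threshold
                if dead.any():
                    rand = z[torch.randint(0, z.shape[0], (int(dead.sum()),))]
                    self.codebook[dead] = rand
                    self.ema_sum[dead] = rand
                    self.ema_count[dead] = 1.0

        # Straight-through estimator: gradients bypass the discrete lookup.
        return z + (z_q - z).detach(), idx
```

The corruption strategy can likewise be sketched as replacing a fraction of the ground-truth code indices fed to the transformer with random codes, so next-token prediction is learned from imperfect prefixes like those the model produces at inference time. The per-token probability `p` here is an assumed placeholder, not the paper's setting.

```python
def corrupt_codes(codes: torch.Tensor, vocab_size: int, p: float = 0.1) -> torch.Tensor:
    """Replace each ground-truth code index with a random one with probability p."""
    mask = torch.rand(codes.shape, device=codes.device) < p
    return torch.where(mask, torch.randint_like(codes, vocab_size), codes)
```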

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-motion generation | HumanML3D (test) | FID | 0.116 | 481 |
| Text-to-motion mapping | HumanML3D (test) | FID | 0.07 | 283 |
| Text-to-motion mapping | KIT-ML (test) | R-Precision (Top 3) | 0.745 | 275 |
| Text-to-motion generation | KIT-ML (test) | FID | 0.512 | 189 |
| Sign Language Translation | PHOENIX-2014T (test) | BLEU-4 | 11.66 | 159 |
| Sign Language Translation | How2Sign (test) | BLEU-4 | 3.53 | 67 |
| Text-to-motion generation | HumanML3D | R-Precision (Top 1) | 49.1 | 64 |
| Text-driven Motion Generation | HumanML3D (test) | R-Precision@1 | 49.7 | 54 |
| Text-to-Motion Synthesis | KIT-ML | R-Precision (Top 1) | 41.6 | 44 |
| Text-to-Motion Synthesis | HumanML3D | R-Precision (Top 1) | 67.6 | 43 |
(Showing 10 of 66 benchmark rows.)
