
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model

About

3D human motion generation is crucial for the creative industry. Recent advances rely on generative models with domain knowledge for text-driven motion generation, leading to substantial progress in capturing common motions. However, performance on more diverse motions remains unsatisfactory. In this work, we propose ReMoDiffuse, a diffusion-model-based motion generation framework that integrates a retrieval mechanism to refine the denoising process. ReMoDiffuse enhances the generalizability and diversity of text-driven motion generation with three key designs: 1) Hybrid Retrieval finds appropriate references from the database in terms of both semantic and kinematic similarities. 2) Semantic-Modulated Transformer selectively absorbs retrieval knowledge, adapting to the difference between retrieved samples and the target motion sequence. 3) Condition Mixture better utilizes the retrieval database during inference, overcoming the scale sensitivity in classifier-free guidance. Extensive experiments demonstrate that ReMoDiffuse outperforms state-of-the-art methods by balancing both text-motion consistency and motion quality, especially for more diverse motion generation.
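For readers unfamiliar with the "scale sensitivity" mentioned in the third design, the following is a minimal sketch of standard classifier-free guidance, the baseline technique the abstract refers to. It is not ReMoDiffuse's Condition Mixture. The function name `cfg_combine` and the toy numbers are illustrative assumptions, not taken from the paper or its code.

```python
from typing import List


def cfg_combine(eps_uncond: List[float], eps_cond: List[float], scale: float) -> List[float]:
    """Standard classifier-free guidance (hypothetical helper, not the paper's code).

    Moves from the unconditional noise prediction toward the conditional one
    by a factor of `scale`; values above 1 extrapolate past the conditional output.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions for a 3-dimensional feature (made-up numbers).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

# The combined prediction changes linearly with the guidance scale,
# so the output can depend strongly on how the scale is chosen.
for scale in (0.0, 1.0, 2.0, 4.0):
    print(scale, cfg_combine(eps_uncond, eps_cond, scale))
```

With a scale of 0 the result equals the unconditional prediction, and with a scale of 1 it equals the conditional one. Larger scales push further in the same direction, which is the sensitivity a fixed hand-tuned scale introduces.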

Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Text-to-motion generation | HumanML3D (test) | FID 0.103 | 481 |
| text-to-motion mapping | HumanML3D (test) | FID 0.103 | 283 |
| text-to-motion mapping | KIT-ML (test) | R Precision (Top 3) 0.765 | 275 |
| Text-to-motion generation | KIT-ML (test) | FID 0.155 | 189 |
| Text-to-motion generation | HumanML3D | R-Precision (Top 1) 51 | 64 |
| Text-driven Motion Generation | HumanML3D (test) | R-Precision@1 51 | 54 |
| Text-to-Motion Synthesis | KIT-ML | R Precision Top 1 42.7 | 44 |
| Interactive Motion Synthesis | InterHuman (test) | R Precision (Top 1) 44.2 | 37 |
| Text-to-motion generation | HumanML3D 19 (test) | FID 0.103 | 37 |
| Text-to-motion generation | HumanML3D full dimension (test) | R-Precision Top 1 46.8 | 20 |
Showing 10 of 22 rows
