MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

About

This work introduces MotionLCM, which extends controllable motion generation to real-time speeds. Existing methods for spatial-temporal control in text-conditioned motion generation suffer from significant runtime inefficiency. To address this, we first propose the motion latent consistency model (MotionLCM), built on the motion latent diffusion model; by adopting one-step (or few-step) inference, it substantially improves runtime efficiency. To ensure effective controllability, we incorporate a motion ControlNet within the latent space of MotionLCM and use explicit control signals (i.e., initial motions) in the vanilla motion space to provide additional supervision during training. With these techniques, our approach generates human motions from text and control signals in real time. Experimental results demonstrate the remarkable generation and controlling capabilities of MotionLCM while maintaining real-time efficiency.
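
The abstract describes two mechanisms: a consistency-distilled denoiser that replaces the iterative diffusion chain with a single (or few) forward passes in the motion latent space, and a ControlNet branch that turns motion-space control signals into latent-space residuals. The sketch below illustrates both ideas in PyTorch; it is a minimal toy under stated assumptions (module names such as `ToyConsistencyDenoiser` and `ToyMotionControlNet`, all dimensions, and the residual wiring are illustrative), not the authors' released implementation.

```python
import torch
import torch.nn as nn

# Illustrative dimensions only; the real model's sizes are not assumed here.
LATENT_DIM, TEXT_DIM, CONTROL_DIM, NUM_STEPS = 256, 512, 64, 1000


class ToyConsistencyDenoiser(nn.Module):
    """Stand-in for a consistency-distilled latent denoiser.

    A trained consistency model maps (noisy latent, timestep, text
    embedding) directly to an estimate of the clean latent, which is
    what makes one-step inference possible."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + TEXT_DIM + 1, 512),
            nn.SiLU(),
            nn.Linear(512, LATENT_DIM),
        )

    def forward(self, z_t, t, text_emb, control_residual=None):
        # Fold the timestep in as a normalized scalar feature.
        h = torch.cat([z_t, text_emb, t.float().unsqueeze(-1) / NUM_STEPS], dim=-1)
        z0_hat = self.net(h)
        if control_residual is not None:
            # The ControlNet output enters as an additive latent residual.
            z0_hat = z0_hat + control_residual
        return z0_hat


class ToyMotionControlNet(nn.Module):
    """Maps a motion-space control signal (e.g., an initial motion) to a
    residual in the denoiser's latent space."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(CONTROL_DIM, LATENT_DIM),
            nn.SiLU(),
            nn.Linear(LATENT_DIM, LATENT_DIM),
        )

    def forward(self, control):
        return self.proj(control)


@torch.no_grad()
def sample_one_step(denoiser, controlnet, text_emb, control):
    """One-step sampling: start from pure noise at the final timestep and
    jump directly to a clean latent in a single forward pass."""
    batch = text_emb.shape[0]
    z_T = torch.randn(batch, LATENT_DIM)     # noise latent z_T
    t = torch.full((batch,), NUM_STEPS - 1)  # final timestep T
    return denoiser(z_T, t, text_emb, controlnet(control))


# Usage: one network evaluation per sample instead of a long diffusion chain.
text_emb = torch.randn(2, TEXT_DIM)    # e.g., frozen text-encoder features
control = torch.randn(2, CONTROL_DIM)  # flattened motion-space control signal
z0 = sample_one_step(ToyConsistencyDenoiser(), ToyMotionControlNet(), text_emb, control)
print(z0.shape)  # torch.Size([2, 256])
```

A few-step variant would alternate this noise-to-latent jump with partial re-noising of the predicted latent; in either case, the predicted latent is decoded by the pretrained motion VAE into a joint-level motion sequence.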

Wenxun Dai, Ling-Hao Chen, Jingbo Wang, Jinpeng Liu, Bo Dai, Yansong Tang • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-motion generation | HumanML3D (test) | FID | 0.049 | 481 |
| Text-to-motion mapping | HumanML3D (test) | FID | 0.467 | 283 |
| Motion control | HumanML3D (test) | Average Error | 0.1092 | 65 |
| Text-to-motion generation | HumanML3D (test) | R-Precision (Top 1) | 0.502 | 32 |
| Text-to-motion generation (kinematic representation) | HumanML3D Kinematic Representation (test) | R-Precision@1 | 0.502 | 19 |
| Text-to-motion | Motion-X | R TOP1 | 65.8 | 17 |
| Motion generation | MBench (official leaderboard) | Jitter Penalty | 0.022 | 17 |
| Text-to-motion generation (joint representation) | HumanML3D Joint Representation (test) | R-Precision@1 | 50.1 | 10 |
| Text-to-motion generation | MBench (test) | Motion Consistency | 48 | 9 |

Other info

Code
