Flexible Motion In-betweening with Diffusion Models

About

Motion in-betweening, a fundamental task in character animation, consists of generating motion sequences that plausibly interpolate user-provided keyframe constraints. It has long been recognized as a labor-intensive and challenging process. We investigate the potential of diffusion models for generating diverse human motions guided by keyframes. Unlike previous in-betweening methods, we propose a simple unified model capable of generating precise and diverse motions that conform to a flexible range of user-specified spatial constraints, as well as text conditioning. To this end, we propose Conditional Motion Diffusion In-betweening (CondMDI), which allows for arbitrary dense-or-sparse keyframe placement and partial keyframe constraints while generating high-quality motions that are diverse and coherent with the given keyframes. We evaluate the performance of CondMDI on the text-conditioned HumanML3D dataset and demonstrate the versatility and efficacy of diffusion models for keyframe in-betweening. We further explore the use of guidance and imputation-based approaches for inference-time keyframing and compare CondMDI against these methods.
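
The abstract mentions partial keyframe constraints and imputation-based keyframing at inference time. The sketch below illustrates only the generic imputation idea, under stated assumptions: a motion is a (frames x features) array, a binary mask marks the constrained entries, and those entries of a sample are overwritten with the observed keyframe values. The function names, array sizes, and feature indices are hypothetical and chosen for illustration; this is not the CondMDI model or the authors' code, and a real sampler would apply such a step inside its denoising loop.

```python
import numpy as np

def make_keyframe_mask(num_frames, num_features, keyframes, feature_idx=None):
    """Binary mask: 1 where a value is constrained, 0 where it is free.

    keyframes   -- frame indices that carry constraints (dense or sparse)
    feature_idx -- optional list of feature columns to constrain
                   (a partial keyframe); None constrains every feature
    """
    mask = np.zeros((num_frames, num_features), dtype=np.float32)
    cols = slice(None) if feature_idx is None else feature_idx
    for t in keyframes:
        mask[t, cols] = 1.0
    return mask

def impute(sample, observed, mask):
    """Overwrite the constrained entries of `sample` with `observed` values."""
    return mask * observed + (1.0 - mask) * sample

# Toy data: 60 frames, 8 features (hypothetical sizes).
rng = np.random.default_rng(0)
observed = rng.standard_normal((60, 8)).astype(np.float32)  # stands in for keyframe data
sample = rng.standard_normal((60, 8)).astype(np.float32)    # stands in for a model sample

# Sparse keyframes at three frames, constraining only features 0..2.
mask = make_keyframe_mask(60, 8, keyframes=[0, 30, 59], feature_idx=[0, 1, 2])
out = impute(sample, observed, mask)
```

After the call, `out` matches `observed` on the three constrained features of frames 0, 30, and 59, and matches `sample` everywhere else. Passing `feature_idx=None` would constrain whole frames instead.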

Setareh Cohan, Guy Tevet, Daniele Reda, Xue Bin Peng, Michiel van de Panne • 2024

Related benchmarks

Task	Dataset	Result
Human Motion Editing	HumanML3D (test)	FID 0.247	15
Motion In-betweening	350k dataset	FPS 1.93e+3	13
Temporal Inpainting (Backcasting)	HumanML3D	MPJPE 2.72	10
Temporal Inpainting (Prediction)	HumanML3D	MPJPE 10	10
Geometric-Constrained Motion Generation	Geometric-Constrained Generation	Trajectory Error 70.46	8
Motion Generation	HumanML3D	MMD 0.1101	7
Motion Generation	Bones-70k	MMD 0.1062	7
Motion Generation	LaFAN1 G1	MMD 0.286	7
Human Motion Completion	HumanML3D Middle-half completion (test)	FID 0.594	6
Human Motion Completion	HumanML3D First-half completion (test)	FID 0.62	6

Showing 10 of 12 rows
