
Seamless Human Motion Composition with Blended Positional Encodings

About

Conditional human motion generation is an important topic with many applications in virtual reality, gaming, and robotics. While prior works have focused on generating motion guided by text, music, or scenes, these typically result in isolated motions confined to short durations. Instead, we address the generation of long, continuous sequences guided by a series of varying textual descriptions. In this context, we introduce FlowMDM, the first diffusion-based model that generates seamless Human Motion Compositions (HMC) without any postprocessing or redundant denoising steps. For this, we introduce the Blended Positional Encodings, a technique that leverages both absolute and relative positional encodings in the denoising chain. More specifically, global motion coherence is recovered at the absolute stage, whereas smooth and realistic transitions are built at the relative stage. As a result, we achieve state-of-the-art results in terms of accuracy, realism, and smoothness on the Babel and HumanML3D datasets. FlowMDM excels when trained with only a single description per motion sequence thanks to its Pose-Centric Cross-ATtention, which makes it robust against varying text descriptions at inference time. Finally, to address the limitations of existing HMC metrics, we propose two new metrics: the Peak Jerk and the Area Under the Jerk, to detect abrupt transitions.
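To make the two proposed metrics concrete, here is a minimal sketch of how jerk-based transition scores could be computed. It assumes only that jerk is the third time derivative of joint position, approximated here with a third-order forward difference. The function names, the choice of taking the maximum over joints per frame, and the `baseline` parameter are illustrative assumptions; the paper's exact definitions and normalization may differ.

```python
import math


def jerk_magnitudes(positions, fps):
    """Per-frame, per-joint jerk magnitude via third-order forward differences.

    positions: list of frames; each frame is a list of (x, y, z) joint positions.
    Returns len(positions) - 3 frames, each a list of per-joint jerk magnitudes.
    """
    dt = 1.0 / fps
    out = []
    for t in range(len(positions) - 3):
        frame = []
        for j in range(len(positions[t])):
            # Third forward difference: p[t+3] - 3*p[t+2] + 3*p[t+1] - p[t]
            d = [
                (positions[t + 3][j][k] - 3 * positions[t + 2][j][k]
                 + 3 * positions[t + 1][j][k] - positions[t][j][k]) / dt ** 3
                for k in range(3)
            ]
            frame.append(math.sqrt(sum(c * c for c in d)))
        out.append(frame)
    return out


def per_frame_jerk(positions, fps):
    # Assumption: summarize each frame by its largest joint jerk.
    return [max(frame) for frame in jerk_magnitudes(positions, fps)]


def peak_jerk(positions, fps):
    """Largest jerk value over all frames and joints of the clip."""
    return max(per_frame_jerk(positions, fps))


def area_under_jerk(positions, fps, baseline=0.0):
    """Rectangle-rule area between the per-frame jerk curve and a baseline.

    Assumption: `baseline` stands in for a reference jerk level (for example,
    a dataset average); it defaults to zero here.
    """
    return sum(abs(v - baseline) for v in per_frame_jerk(positions, fps)) / fps
```

In practice these functions would be applied to the frames around a transition between two text-conditioned segments: a smooth transition keeps both values low, while an abrupt one produces a spike in the jerk curve and therefore a high peak.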

German Barquero, Sergio Escalera, Cristina Palmero • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Human Motion Composition | BABEL | PJ 0.06 | 13 |
| Pose-conditioned motion generation | MotionHub | R-P T3 0.288 | 10 |
| Pose-conditioned motion generation | HumanML3D | R-Precision (Top 3) 0.453 | 10 |
| Human Motion Composition | HumanML3D Subsequence (test) | R-precision 68.5 | 6 |
| Human Motion Composition | HumanML3D Transition (test) | FID 1.38 | 6 |
| Dexterous Hand Manipulation Sequence Generation | DexYCB (seen) | Diversity Score 61.25 | 6 |
| Hand Manipulation Sequence Generation | OakInk (Seen) | Diversity 189.5 | 6 |
| Dexterous Hand Manipulation Sequence Generation | DexYCB (unseen) | MPJPE 86.13 | 5 |
| Hand Manipulation Sequence Generation | OakInk (Unseen) | MPJPE 65.39 | 5 |
| Sequential action generation | BABEL | R@3 46 | 5 |
