FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

About

Text-to-motion synthesis is a crucial task in computer vision. Existing methods are limited in their universality: they are tailored to single-person or two-person scenarios and cannot be applied to generate motions for more individuals. To achieve number-free motion synthesis, this paper reconsiders motion generation and proposes to unify single- and multi-person motion through the conditional motion distribution. Furthermore, a generation module and an interaction module are designed for our FreeMotion framework to decouple the process of conditional motion generation and ultimately support number-free motion synthesis. In addition, existing single-person spatial motion control methods can be seamlessly integrated into our framework, achieving precise control of multi-person motion. Extensive experiments demonstrate the superior performance of our method and its capability to infer single- and multi-human motions simultaneously.
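One way to read "unify single and multi-person motion by the conditional motion distribution" is that each person's motion is sampled conditioned on the text and on the motions generated so far, so the same loop serves one person or many. The sketch below illustrates only that reading. The names `synthesize`, `generate`, and `toy_generate` are hypothetical and do not come from the FreeMotion code, and the toy generator stands in for the paper's generation and interaction modules, which are not reproduced here.

```python
from typing import Callable, List, Sequence

# Placeholder type for one person's motion sequence (assumption for illustration).
Motion = List[float]


def synthesize(
    text: str,
    num_people: int,
    generate: Callable[[str, Sequence[Motion]], Motion],
) -> List[Motion]:
    """Sample motions one person at a time.

    Each call to `generate` receives the text prompt and the motions
    produced so far, so the number of people is not fixed in advance.
    """
    if num_people < 1:
        raise ValueError("num_people must be at least 1")
    motions: List[Motion] = []
    for _ in range(num_people):
        # The first person is conditioned on the text alone (empty context);
        # later people are also conditioned on the earlier motions.
        motions.append(generate(text, tuple(motions)))
    return motions


def toy_generate(text: str, others: Sequence[Motion]) -> Motion:
    # Hypothetical stand-in for a learned model: it only records how many
    # previously generated motions it was conditioned on.
    return [float(len(others))]
```

For example, `synthesize("three people dance in a circle", 3, toy_generate)` calls the generator three times, with zero, one, and two prior motions as context; with `num_people=1` the loop reduces to ordinary single-person generation.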

Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma • 2024

Related benchmarks

Task	Dataset	Result
Interactive Motion Synthesis	InterHuman (test)	R Precision (Top 1) 32.6	37
Human action-reaction synthesis	InterHuman-AS (test)	R Top3 40.9	9
Human-Human Interaction	Inter-X	FID 0.492	7
Motion Generation	Inter-X	FID 0.492	6
Text-to-motion generation	InterHuman re-annotated single-person text (test)	R-Precision Top 1 26.4	3
