DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

About

We propose DiffSHEG, a Diffusion-based approach for Speech-driven Holistic 3D Expression and Gesture generation of arbitrary length. While previous works focused on co-speech gesture or expression generation individually, the joint generation of synchronized expressions and gestures remains barely explored. To address this, our diffusion-based co-speech motion generation transformer enables uni-directional information flow from expression to gesture, facilitating improved matching of joint expression-gesture distributions. Furthermore, we introduce an outpainting-based sampling strategy for arbitrarily long sequence generation in diffusion models, offering flexibility and computational efficiency. Our method provides a practical solution that produces high-quality, synchronized expressions and gestures driven by speech. Evaluated on two public datasets, our approach achieves state-of-the-art performance both quantitatively and qualitatively. Additionally, a user study confirms the superiority of DiffSHEG over prior approaches. By enabling the real-time generation of expressive and synchronized motions, DiffSHEG showcases its potential for various applications in the development of digital humans and embodied agents.

Junming Chen, Yunfei Liu, Jianan Wang, Ailing Zeng, Yu Li, Qifeng Chen • 2024
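
The outpainting-based sampling idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the denoiser below is a hypothetical stand-in for a trained model, the names `toy_denoise_step`, `sample_clip`, and `generate_long` and all parameter values are assumptions made for illustration, and the overlap frames are overwritten with clean values at every step, which is a simplification of how a real diffusion sampler would condition on known frames. It only shows the general pattern: generate fixed-length clips one after another, holding the first few frames of each new clip equal to the last frames of the motion generated so far.

```python
import random


def toy_denoise_step(frames, rng):
    # Hypothetical stand-in for one reverse-diffusion step of a trained model.
    return [[0.9 * v + 0.1 * rng.gauss(0.0, 1.0) for v in frame] for frame in frames]


def sample_clip(clip_len, dim, steps, rng, known_prefix=None):
    """Sample one clip; if known_prefix is given, its frames are held fixed."""
    prefix = known_prefix or []
    # Start from pure Gaussian noise.
    x = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(clip_len)]
    for _ in range(steps):
        x = toy_denoise_step(x, rng)
        # Outpainting-style constraint (simplified): overwrite the overlap
        # region with the frames that were already generated.
        x[:len(prefix)] = [list(frame) for frame in prefix]
    return x


def generate_long(total_len, clip_len=32, overlap=8, dim=4, steps=10, seed=0):
    """Chain overlapping clips until total_len frames are available."""
    rng = random.Random(seed)
    motion = sample_clip(clip_len, dim, steps, rng)
    while len(motion) < total_len:
        clip = sample_clip(clip_len, dim, steps, rng, known_prefix=motion[-overlap:])
        # Keep only the newly generated frames after the shared overlap.
        motion.extend(clip[overlap:])
    return motion[:total_len]


motion = generate_long(100)
print(len(motion), len(motion[0]))
```

Because each new clip depends only on the tail of the previous one, the sequence can be extended for as long as audio keeps arriving, which is the property the abstract refers to as arbitrary-length generation.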

Related benchmarks

Task	Dataset	Result
Co-speech 3D Gesture Synthesis	BEAT2 (test)	FGD 8.986	27
Gesture Generation	BEAT-2 (test)	BC 0.743	22
Speech-driven gesture generation	BEAT2 (Seen Speakers)	FGD 0.899	18
Gesture Generation	BEAT2	FGD 8.986	17
Co-speech gesture generation	BEAT	FGD 7.141	13
Co-speech gesture generation	BEAT One Speaker v2 (Speaker 2)	FGD 8.986	12
Gesture Generation	BEAT (test)	BC 74.3	12
Holistic Motion Generation	BEAT2	FGD 8.986	12
Gesture Synchronization	BEAT2 (test)	FGD 10.51	11
3D Gesture Motion Generation	BEAT-X	BC 0.743	10

Showing 10 of 19 rows
