
DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models

About

Communication involves more than speech: gestures are part of it too. Automatic co-speech gesture generation has drawn much attention in computer animation. It is a challenging task because gestures are diverse and because it is difficult to match the rhythm and semantics of a gesture to the corresponding speech. To address these problems, we present DiffuseStyleGesture, a diffusion-model-based approach to speech-driven gesture generation. It generates high-quality, speech-matched, stylized, and diverse co-speech gestures for speech of arbitrary length. Specifically, we introduce cross-local attention and self-attention into the gesture diffusion pipeline to generate more realistic gestures that better match the speech. We then train our model with classifier-free guidance so that gesture style can be controlled by interpolation or extrapolation. Additionally, we improve the diversity of generated gestures by varying the initial gestures and noise. Extensive experiments show that our method outperforms recent approaches to speech-driven gesture generation. Our code, pre-trained models, and demos are available at https://github.com/YoungSeng/DiffuseStyleGesture.
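
The style control by interpolation or extrapolation mentioned above can be illustrated with the generic classifier-free guidance blend. This is a minimal sketch and not the authors' implementation: the function name `guided_prediction`, the plain-list representation of a model output, and the `scale` parameter are all assumptions made for illustration, and the paper's exact formulation may differ.

```python
from typing import List, Sequence


def guided_prediction(
    cond: Sequence[float],
    uncond: Sequence[float],
    scale: float,
) -> List[float]:
    """Blend a style-conditioned and an unconditioned model output.

    Hypothetical helper, not taken from the DiffuseStyleGesture code.
    scale = 0      -> unconditioned output only
    0 < scale < 1  -> interpolation between the two outputs
    scale = 1      -> conditioned output only
    scale > 1      -> extrapolation past the conditioned output
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    # Standard classifier-free guidance form: uncond + scale * (cond - uncond)
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy two-dimensional "outputs" standing in for a predicted gesture frame.
styled = [1.0, 2.0]
neutral = [0.0, 0.0]
half_style = guided_prediction(styled, neutral, 0.5)    # interpolation
exaggerated = guided_prediction(styled, neutral, 2.0)   # extrapolation
```

In this form a single scalar moves the output between "no style" and "exaggerated style", which is the kind of control an interpolation or extrapolation scheme provides.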

Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Co-speech gesture generation | BEAT All Speakers 2 | BC | 7.241 | 31 |
| Co-speech 3D Gesture Synthesis | BEAT2 (test) | FGD | 6.644 | 27 |
| Gesture Generation | BEAT-2 (test) | BC | 7.241 | 22 |
| Co-speech gesture generation | BEAT-2 (1 Speaker) | BC | 7.241 | 17 |
| Gesture Generation | BEAT2 | FGD | 8.811 | 17 |
| Co-speech motion generation | BEATX (test) | FGD | 10.137 | 16 |
| Gesture Synchronization | BEAT2 (test) | FGD | 8.811 | 11 |
| Co-speech gesture generation | BEATX Standard (test) | FGD | 10.137 | 11 |
| Speech-driven gesture generation | BEAT-X | FGD | 8.811 | 11 |
| Speech-driven Holistic Expression and Gesture Generation | BEAT 2022 (test) | FMD | 1.26e+3 | 9 |
Showing 10 of 18 rows
