
Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents

About

We present Text2Gestures, a transformer-based learning method to interactively generate emotive full-body gestures for virtual agents aligned with natural language text inputs. Our method generates emotionally expressive gestures by utilizing the relevant biomechanical features for body expressions, also known as affective features. We also consider the intended task corresponding to the text and the target virtual agent's intended gender and handedness in our generation pipeline. We train and evaluate our network on the MPI Emotional Body Expressions Database and observe that our network achieves state-of-the-art performance in generating gestures for virtual agents aligned with text for narration or conversation. Our network can generate these gestures at interactive rates on a commodity GPU. We conducted a web-based user study in which around 91% of participants rated our generated gestures as at least plausible on a five-point Likert scale. The emotions participants perceived from the gestures are also strongly positively correlated with the corresponding intended emotions, with a minimum Pearson coefficient of 0.77 in the valence dimension.
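The perceptual evaluation above reports a Pearson correlation coefficient between intended and perceived emotions. As a minimal sketch of how such a coefficient is computed (the ratings below are hypothetical illustration data, not values from the study):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical valence ratings per stimulus:
# intended emotion valence vs. mean perceived valence from participants.
intended  = [0.9, -0.8, 0.2, 0.7, -0.5]
perceived = [0.8, -0.6, 0.1, 0.9, -0.4]
r = pearson(intended, perceived)
```

A coefficient of 1.0 indicates perfect positive agreement; the paper's reported minimum of 0.77 in the valence dimension indicates strong agreement between intended and perceived emotions.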

Uttaran Bhattacharya, Nicholas Rewkowski, Abhishek Banerjee, Pooja Guhan, Aniket Bera, Dinesh Manocha • 2021

Related benchmarks

Task                       | Dataset          | Metric              | Result | Rank
Text-to-motion generation  | HumanML3D (test) | FID                 | 7.664  | 331
Text-to-motion mapping     | KIT-ML (test)    | R-Precision (Top 3) | 0.338  | 275
Text-to-motion mapping     | HumanML3D (test) | FID                 | 5.012  | 243
Text-to-motion synthesis   | HumanML3D        | --                  | --     | 43
Text-to-motion generation  | KIT (test)       | R-Precision (Top 1) | 15.6   | 14
Text-to-motion synthesis   | KIT-ML           | R-Precision (Top 1) | 15.6   | 10
