
GDPO-Listener: Expressive Interactive Head Generation via Auto-Regressive Flow Matching and Group reward-Decoupled Policy Optimization

About

Generating realistic 3D head motion for dyadic interactions is a significant challenge in virtual human synthesis. While recent methods achieve impressive results with speaking heads, they frequently suffer from the "Regression-to-the-Mean" problem in listener motions, collapsing into static faces, and lack the parameter space for complex nonverbal motions. In this paper, we propose GDPO-Listener, a novel framework that achieves highly expressive speaking and listening motion generation. First, we introduce an Auto-Regressive Flow Matching architecture enabling stable supervised learning. Second, to overcome kinematic stillness, we apply Group reward-Decoupled Policy Optimization (GDPO). By isolating reward normalization across distinct FLAME parameter groups, GDPO explicitly incentivizes high-variance, expressive generations. Finally, we enable explicit semantic text control for customizable responses. Extensive evaluations on the Seamless Interaction and DualTalk datasets demonstrate superior performance compared to existing baselines on long-term kinematic variance, visual expressivity, and semantic controllability.
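The phrase "isolating reward normalization across distinct FLAME parameter groups" can be illustrated with a minimal sketch. It is not the authors' implementation: the group names (`expression`, `head_pose`), the reward values, and the choice to sum the normalized rewards are all assumptions made for illustration. The sketch contrasts normalizing each reward group separately with normalizing a single summed reward.

```python
from statistics import mean, pstdev


def normalize(values, eps=1e-8):
    """Standardize rewards to zero mean and (approximately) unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def decoupled_advantages(rewards_by_group):
    """Normalize each reward group on its own, then sum per rollout.

    rewards_by_group maps a (hypothetical) group name to a list holding
    one reward per sampled rollout.
    """
    normalized = [normalize(r) for r in rewards_by_group.values()]
    return [sum(column) for column in zip(*normalized)]


def coupled_advantages(rewards_by_group):
    """Baseline for comparison: sum raw rewards first, then normalize once."""
    totals = [sum(column) for column in zip(*rewards_by_group.values())]
    return normalize(totals)


# Hypothetical rewards for 4 sampled rollouts; names and scales are illustrative.
rewards = {
    "expression": [0.10, 0.20, 0.30, 0.40],
    "head_pose": [40.0, 10.0, 30.0, 20.0],
}

decoupled = decoupled_advantages(rewards)
coupled = coupled_advantages(rewards)

for i, (d, c) in enumerate(zip(decoupled, coupled)):
    print(f"rollout {i}: decoupled={d:+.3f} coupled={c:+.3f}")
```

In this toy setup the coupled variant is dominated by whichever group has the largest raw scale, while the decoupled variant gives each group equal weight in the final advantage.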

Zhangyu Jin, Maksim Siniukov, Deuksin Kwon, Ashutosh Chaubey, Mohammad Soleymani • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| Speaking facial motion generation | Seamless Interaction (test) | LVE 3.08 | 13 |
| Listening facial motion generation | Seamless Interaction (test) | FDD 22.56 | 9 |
| Speaking Head Motion Generation | Seamless Interaction Dataset | LVE 2.95 | 6 |
| Listening Head Motion Generation | Seamless Interaction Dataset | FDD 18.85 | 4 |
| Interactive Head Generation | User Study Prolific (results) | Lip Sync Score 70.4 | 3 |
| Listening Head Generation | DualTalk All Listening | FDD 26.95 | 3 |
| Listening Head Generation | DualTalk Expressive Listening | FDD 33.99 | 3 |
