Digital Life Project: Autonomous 3D Characters with Social Intelligence

About

In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters, who are capable of engaging in social interactions and expressing with articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique to ensure motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously, while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, a motion captioning module further allows the virtual character to recognize and appropriately respond to human players' actions.

Homepage: https://digital-life-project.com/

Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Motion Description | HumanML3D (test) | BLEU-1 | 51.1 | 27
Interactive Motion Synthesis | InterHuman (test) | R Precision (Top 1) | 44.9 | 25
Human-human interaction motion generation | InterHuman | FID | 5.674 | 23
Multimodal Dialogue Generation | SynMSI (test) | Context Relevance | 3.577 | 13
text-conditioned human interaction generation | InterHuman (test) | R Precision (Top 1) | 44.9 | 12
motion-to-text translation | KIT-ML (test) | BLEU@1 | 53.88 | 10
Human Motion Generation | InterHuman (test) | R@Top3 | 66.6 | 10
Interactive Motion Synthesis | DLP (test) | R Precision Top 1 | 51.7 | 6
