Digital Life Project: Autonomous 3D Characters with Social Intelligence

About

In this work, we present Digital Life Project, a framework that uses language as the universal medium to build autonomous 3D characters capable of engaging in social interactions and expressing themselves through articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique for ensuring motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, a motion captioning module allows the virtual character to recognize and appropriately respond to human players' actions.

Homepage: https://digital-life-project.com/

Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu • 2023

Related benchmarks

Task	Dataset	Metric	Result
Interactive Motion Synthesis	InterHuman (test)	R Precision (Top 1)	44.9	37
text-conditioned human interaction generation	InterHuman (test)	R Precision (Top 3)	66.6	36
Motion Description	HumanML3D (test)	BLEU-1	51.1	27
Human-human interaction motion generation	InterHuman	FID	5.674	23
Text-to-motion	HumanML3D 272-dim (test)	R-Precision Top 1	44.9	14
Multimodal Dialogue Generation	SynMSI (test)	Context Relevance	3.577	13
motion-to-text translation	KIT-ML (test)	BLEU@1	53.88	10
Human Motion Generation	InterHuman (test)	R@Top3	66.6	10
Interactive Motion Synthesis	DLP (test)	R Precision Top 1	51.7	6
