Digital Life Project: Autonomous 3D Characters with Social Intelligence
About
In this work, we present Digital Life Project, a framework that uses language as the universal medium to build autonomous 3D characters capable of engaging in social interactions and expressing themselves through articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique that ensures motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, a motion captioning module allows the virtual character to recognize and appropriately respond to human players' actions.
Homepage: https://digital-life-project.com/
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Motion Description | HumanML3D (test) | BLEU-1 | 51.1 | 27 |
| Interactive Motion Synthesis | InterHuman (test) | R Precision (Top 1) | 44.9 | 25 |
| Human-human interaction motion generation | InterHuman | FID | 5.674 | 23 |
| Multimodal Dialogue Generation | SynMSI (test) | Context Relevance | 3.577 | 13 |
| text-conditioned human interaction generation | InterHuman (test) | R Precision (Top 1) | 44.9 | 12 |
| motion-to-text translation | KIT-ML (test) | BLEU@1 | 53.88 | 10 |
| Human Motion Generation | InterHuman (test) | R@Top3 | 66.6 | 10 |
| Interactive Motion Synthesis | DLP (test) | R Precision Top 1 | 51.7 | 6 |