BoDiffusion: Diffusing Sparse Observations for Full-Body Human Motion Synthesis
About
Mixed reality applications require tracking the user's full-body motion to enable an immersive experience. However, typical head-mounted devices can only track head and hand movements, leading to a limited reconstruction of full-body motion due to variability in lower body configurations. We propose BoDiffusion -- a generative diffusion model for motion synthesis to tackle this under-constrained reconstruction problem. We present a time and space conditioning scheme that allows BoDiffusion to leverage sparse tracking inputs while generating smooth and realistic full-body motion sequences. To the best of our knowledge, this is the first approach that uses the reverse diffusion process to model full-body tracking as a conditional sequence generation task. We conduct experiments on the large-scale motion-capture dataset AMASS and show that our approach outperforms the state-of-the-art approaches by a significant margin in terms of full-body motion realism and joint reconstruction error.
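
To make the idea of conditional generation by reverse diffusion concrete, here is a minimal sketch of generic DDPM ancestral sampling in which every denoising step sees the sparse tracking signal. It is not the authors' architecture or conditioning scheme. The names `toy_denoiser` and `sample_motion`, the untrained random weights, the linear noise schedule, and the feature sizes are all assumptions made for illustration, so the output is not meaningful motion.

```python
import numpy as np


def toy_denoiser(x_t, t_frac, cond, weights):
    """Hypothetical stand-in for a trained noise-prediction network.

    It concatenates the noisy motion, the sparse tracking condition and the
    normalized timestep per frame, then applies one fixed random layer.
    """
    t_feat = np.full((x_t.shape[0], 1), t_frac)
    features = np.concatenate([x_t, cond, t_feat], axis=1)
    return np.tanh(features @ weights)


def sample_motion(cond, motion_dim, num_steps=50, seed=0):
    """Generic DDPM ancestral sampling conditioned on `cond` (frames x cond_dim)."""
    rng = np.random.default_rng(seed)
    num_frames, cond_dim = cond.shape

    # Assumed linear noise schedule; the paper's schedule may differ.
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Untrained weights: a placeholder for learned parameters.
    weights = 0.01 * rng.standard_normal((motion_dim + cond_dim + 1, motion_dim))

    # Start from pure Gaussian noise and denoise step by step.
    x = rng.standard_normal((num_frames, motion_dim))
    for t in reversed(range(num_steps)):
        eps_hat = toy_denoiser(x, t / num_steps, cond, weights)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x
```

For example, `sample_motion(np.zeros((16, 18)), motion_dim=66)` returns an array of shape `(16, 66)`: 16 frames of a hypothetical 66-dimensional pose vector, generated from a hypothetical 18-dimensional head-and-hands signal per frame.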
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Human Pose Estimation | AMASS (Protocol 1) | MPJPE | 5.16 | 12 |
| Motion Tracking | Nymeria | Full Error | 79.8 | 8 |
| Human Pose Estimation | AMASS Protocol 2, Upper body ×0.7 (test) | MPJPE | 7.61 | 8 |
| IMU-to-Motion | LINGO 3pt (test) | MPJPE | 106.4 | 5 |
| Human Pose Estimation | AMASS Protocol 1 Upper body scale 0.7 (test) | MPJPE | 7.44 | 4 |
| Human Pose Estimation | AMASS Protocol 1 Arms scale 0.7 (test) | MPJPE | 7.83 | 4 |
| Human Pose Estimation | AMASS Default shape (Protocol 2) | MPJPE | 3.59 | 4 |
| Human Pose Estimation | AMASS Upper body ×1.4 (Protocol 2) | MPJPE | 17.39 | 4 |
| Human Pose Estimation | AMASS Arms ×1.4 Torso ×0.7 (Protocol 2) | MPJPE | 9.99 | 4 |
| Human Pose Estimation | AMASS Protocol 2, Arms ×1.4 (test) | MPJPE | 13.19 | 4 |
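
Most rows above report MPJPE (mean per-joint position error). As a minimal sketch of how this metric is commonly computed, assuming predicted and ground-truth joint positions are given as arrays of shape `(frames, joints, 3)`: the function name `mpjpe` and the array layout are illustrative, and the result is in whatever length unit the inputs use, since the table does not state one.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints.

    Both inputs are assumed to have shape (frames, joints, 3).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Per-joint distance along the coordinate axis, then average everything.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean())
```

Benchmarks sometimes align the two skeletons (for example at the root joint) before measuring the error. This sketch applies no alignment.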