HuMoR: 3D Human Motion Model for Robust Pose Estimation
About
We introduce HuMoR: a 3D Human Motion Model for Robust Estimation of temporal pose and shape. Though substantial progress has been made in estimating 3D human motion and shape from dynamic observations, recovering plausible pose sequences in the presence of noise and occlusions remains a challenge. For this purpose, we propose an expressive generative model in the form of a conditional variational autoencoder, which learns a distribution of the change in pose at each step of a motion sequence. Furthermore, we introduce a flexible optimization-based approach that leverages HuMoR as a motion prior to robustly estimate plausible pose and shape from ambiguous observations. Through extensive evaluations, we demonstrate that our model generalizes to diverse motions and body shapes after training on a large motion capture dataset, and enables motion reconstruction from multiple input modalities including 3D keypoints and RGB(-D) videos.
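To make the idea of "a distribution of the change in pose at each step" concrete, the following is a minimal sketch of an autoregressive rollout: a latent is sampled conditioned on the previous pose, decoded into a pose change, and added to the previous pose. It is not the authors' implementation. The class `ToyMotionPrior`, its methods, the linear layers with random untrained weights, and the unit-variance latent sampling are all assumptions made purely for illustration.

```python
import random


def linear(weights, bias, vec):
    """Compute W @ vec + b using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for trained network parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class ToyMotionPrior:
    """Hypothetical, untrained stand-in for a conditional motion model."""

    def __init__(self, pose_dim, latent_dim, seed=0):
        self.rng = random.Random(seed)
        # Maps the previous pose to the mean of the latent distribution.
        self.w_prior = random_matrix(latent_dim, pose_dim, self.rng)
        self.b_prior = [0.0] * latent_dim
        # Maps [latent, previous pose] to a change in pose.
        self.w_dec = random_matrix(pose_dim, latent_dim + pose_dim, self.rng)
        self.b_dec = [0.0] * pose_dim

    def sample_latent(self, prev_pose):
        """Sample z with a pose-dependent mean and unit variance (assumed)."""
        mean = linear(self.w_prior, self.b_prior, prev_pose)
        return [m + self.rng.gauss(0.0, 1.0) for m in mean]

    def decode_delta(self, latent, prev_pose):
        """Decode the change in pose from the latent and the previous pose."""
        return linear(self.w_dec, self.b_dec, latent + prev_pose)

    def rollout(self, init_pose, num_steps):
        """Generate a sequence by repeatedly adding sampled pose changes."""
        poses = [list(init_pose)]
        for _ in range(num_steps):
            prev = poses[-1]
            z = self.sample_latent(prev)
            delta = self.decode_delta(z, prev)
            poses.append([p + d for p, d in zip(prev, delta)])
        return poses


if __name__ == "__main__":
    model = ToyMotionPrior(pose_dim=3, latent_dim=2, seed=0)
    for pose in model.rollout([0.0, 0.0, 0.0], num_steps=5):
        print(pose)
```

Predicting a change rather than an absolute pose keeps each step anchored to the previous one. In an optimization-based fitting setup such as the one the abstract describes, the likelihood of each transition under a model like this could act as the prior term, though the exact formulation is not given here.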
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | 3DPW (test) | -- | -- | 505 |
| Motion Completion | HumanML3D (test) | MPJPE | 7.2 | 40 |
| 3D Human-Scene Contact Estimation | RICH (test) | Precision | 24.8 | 11 |
| Surface Reconstruction | AMA samba | CD | 9.8 | 10 |
| Surface Reconstruction | AMA bouncing | CD | 11.5 | 10 |
| Body estimation | AMASS (test) | MPJPE (mm) | 199.5 | 8 |
| Body estimation | RICH | MPJPE | 319.8 | 8 |
| Body estimation | Aria Digital Twins | MPJPE | 284.9 | 8 |
| 3D Human Pose Reconstruction | AMASS (Setting S2) | MPJPE | 5.5 | 7 |
| Human Pose Estimation | PROX | ACD | 311.5 | 7 |
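Several rows above report MPJPE (mean per-joint position error), which is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below computes it for a single frame. The function name `mpjpe` and the input layout (a list of `(x, y, z)` tuples per pose) are assumptions for illustration. The table does not say which alignment or units each benchmark uses, so this version applies no alignment and returns the error in whatever units the inputs are in.

```python
import math


def mpjpe(pred_joints, true_joints):
    """Mean Euclidean distance between corresponding 3D joints."""
    if len(pred_joints) != len(true_joints) or not pred_joints:
        raise ValueError("joint lists must be non-empty and of equal length")
    total = sum(math.dist(p, t) for p, t in zip(pred_joints, true_joints))
    return total / len(pred_joints)


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    ground_truth = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
    # Per-joint errors are 3.0 and 4.0, so the mean is 3.5.
    print(mpjpe(predicted, ground_truth))
```

For a sequence, the per-frame values would typically be averaged over all frames.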