Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling
About
We present a method for simultaneously estimating 3D human pose and body shape from a sparse set of wide-baseline camera views. We train a symmetric convolutional autoencoder with a dual loss that enforces learning of a latent representation that encodes skeletal joint positions, and at the same time learns a deep representation of volumetric body shape. We harness the latter to up-scale input volumetric data by a factor of $4 \times$, whilst recovering a 3D estimate of joint positions with equal or greater accuracy than the state of the art. Inference runs in real-time (25 fps) and has the potential for passive human behaviour monitoring where there is a requirement for high fidelity estimation of human body shape and pose.
Matthew Trumble, Andrew Gilbert, Adrian Hilton, John Collomosse • 2018
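The dual loss described in the abstract can be illustrated with a minimal sketch. It assumes both terms are mean squared errors combined through a single scalar weight; the function names, the `weight` parameter, and the choice of MSE are illustrative assumptions and are not taken from the paper.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_loss(latent_joints, true_joints, decoded_volume, target_volume, weight=1.0):
    """Hypothetical dual loss: a joint-position term on the latent code
    plus a weighted reconstruction term on the decoded volume."""
    joint_term = mse(latent_joints, true_joints)       # skeletal supervision on the bottleneck
    volume_term = mse(decoded_volume, target_volume)   # volumetric shape reconstruction
    return joint_term + weight * volume_term


# Toy values only: three joint coordinates and a two-voxel "volume".
loss = dual_loss([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0], [0.0, 0.0], weight=0.5)
```

In this form the latent code is pushed towards the joint positions while the decoder output is pushed towards the higher-resolution target volume, so one network serves both tasks.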
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | Human3.6M (test) | MPJPE (Average) | 62.5 | 547 |
| 3D Human Pose Estimation | Human3.6M S9 and S11 (test) | Dir. Error | 41.7 | 72 |
| 3D Pose Estimation | Total Capture (test) | Mean MPJPE | 34.1 | 42 |
| 3D Human Pose Estimation | Human3.6M protocol 2 (val) | MPJPE (Directions) | 41.7 | 8 |
| 3D Human Pose Estimation | TotalCapture (Seen Subjects (S1, S2, S3)) | W2 Error | 13 | 7 |
| 3D Human Pose Estimation | TotalCapture (Unseen Subjects (S4, S5)) | W2 Error | 21.8 | 7 |
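Several rows above report MPJPE (mean per-joint position error), which is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below shows that calculation for a single pose. The function name and toy coordinates are illustrative, and it omits any alignment step (such as root-joint centring) that a particular benchmark protocol may apply first.

```python
from math import dist


def mpjpe(predicted, ground_truth):
    """Mean Euclidean distance over corresponding 3D joints of one pose.

    Both arguments are sequences of (x, y, z) tuples in the same unit;
    the result is expressed in that unit.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and have the same joint count")
    return sum(dist(p, g) for p, g in zip(predicted, ground_truth)) / len(predicted)


# Toy two-joint pose: the first joint is off by 5 units, the second is exact.
error = mpjpe([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
              [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)])
```

A dataset-level score would then average this per-pose value over all evaluated frames.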