TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
About
We address the problem of regressing 3D human pose and shape from a single image, with a focus on 3D accuracy. The current best methods leverage large datasets of 3D pseudo-ground-truth (p-GT) and 2D keypoints, leading to robust performance. With such methods, we observe a paradoxical decline in 3D pose accuracy with increasing 2D accuracy. This is caused by biases in the p-GT and the use of an approximate camera projection model. We quantify the error induced by current camera models and show that fitting 2D keypoints and p-GT accurately causes incorrect 3D poses. Our analysis defines the invalid distances within which minimizing 2D and p-GT losses is detrimental. We use this to formulate a new loss, Threshold-Adaptive Loss Scaling (TALS), that penalizes gross 2D and p-GT losses but not smaller ones. With such a loss, there are many 3D poses that could equally explain the 2D evidence. To reduce this ambiguity, we need a prior over valid human poses, but such priors can introduce unwanted bias. To address this, we exploit a tokenized representation of human pose and reformulate the problem as token prediction. This restricts the estimated poses to the space of valid poses, effectively providing a uniform prior. Extensive experiments on the EMDB and 3DPW datasets show that our reformulated keypoint loss and tokenization allow us to train on in-the-wild data while improving 3D accuracy over the state-of-the-art. Our models and code are available for research at https://tokenhmr.is.tue.mpg.de.
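The idea of penalizing gross errors while largely ignoring small ones can be illustrated with a minimal sketch. This is not the authors' TALS formulation, which the abstract does not spell out: the function name `threshold_scaled_loss`, the hard cut at `threshold`, and the `low_scale` weight are all assumptions made for illustration.

```python
def threshold_scaled_loss(errors, threshold, low_scale=0.0):
    """Mean error in which sub-threshold terms are down-weighted.

    errors:    non-negative per-keypoint errors (any consistent unit).
    threshold: hypothetical distance below which fitting is not trusted.
    low_scale: hypothetical weight for errors at or below the threshold
               (0.0 ignores them entirely).
    """
    errors = list(errors)
    if not errors:
        return 0.0
    total = 0.0
    for e in errors:
        # Gross errors keep full weight; small ones are scaled down.
        weight = 1.0 if e > threshold else low_scale
        total += weight * e
    return total / len(errors)


if __name__ == "__main__":
    # One small error (0.5) and one gross error (2.0), threshold 1.0.
    print(threshold_scaled_loss([0.5, 2.0], threshold=1.0))
    print(threshold_scaled_loss([0.5, 2.0], threshold=1.0, low_scale=0.1))
```

A hard cut like this makes the loss discontinuous at the threshold; a smooth ramp would be a natural alternative, but that design choice is likewise not taken from the paper.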
Related benchmarks
| Task | Dataset | Metric | Value (mm) | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | 3DPW (test) | PA-MPJPE | 44.3 | 505 |
| 3D Human Mesh Recovery | 3DPW (test) | PA-MPJPE | 47.5 | 264 |
| 3D Human Mesh Recovery | Human3.6M (test) | PA-MPJPE | 36.3 | 120 |
| 3D Human Mesh Recovery | 3DPW | PA-MPJPE | 43.7 | 72 |
| Human Mesh Reconstruction | Human3.6M | PA-MPJPE | 36.3 | 50 |
| Human Mesh Reconstruction | 3DPW 14 joints (test) | PA-MPJPE | 44.3 | 26 |
| 3D Human Pose and Mesh Estimation | 3DPW (test) | PA-MPJPE | 43.8 | 24 |
| Human Mesh Reconstruction | EMDB 24 joints (test) | PA-MPJPE | 55.6 | 21 |
| Human Mesh Recovery | EMDB (test) | PA-MPJPE | 55.6 | 19 |
| 3D Human Pose and Shape Recovery | EMDB 1 | MPJPE | 88.1 | 18 |
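
For readers unfamiliar with the two metrics in the table, the following sketch shows how MPJPE and PA-MPJPE are commonly computed with NumPy. It is a generic illustration, not the evaluation code behind these results: the function names and the `(J, 3)` joint layout are assumptions, and the output is in whatever unit the inputs use.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance per joint."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def procrustes_align(pred, gt):
    """Similarity-align pred to gt (scale, rotation, translation)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance between the centered point sets.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    diag = np.array([1.0, 1.0, d])
    rot = vt.T @ np.diag(diag) @ u.T
    scale = (s * diag).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)


if __name__ == "__main__":
    gt = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
    # Prediction that differs from gt only by scale, rotation and translation.
    pred = 2.0 * gt @ rz.T + np.array([1.0, 2.0, 3.0])
    print(mpjpe(pred, gt), pa_mpjpe(pred, gt))
```

Because the alignment removes global scale, rotation and translation, PA-MPJPE measures only the articulated pose error, which is why it is lower than MPJPE for the same prediction.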