Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows
About
3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and only estimate a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing-flow-based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. Additionally, uncertain detections and occlusions are effectively modeled by incorporating uncertainty information from the 2D detector as a condition. Further keys to success are a learned 3D pose prior and a generalization of the best-of-M loss. We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics. The implementation is available on GitHub.
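To make the best-of-M idea concrete, the following is a minimal NumPy sketch of the standard best-of-M loss: out of M sampled 3D pose hypotheses, only the one closest to the ground truth contributes to the loss. It is an illustration under stated assumptions, not the authors' implementation. The function name `best_of_m_loss`, the array shapes, and the toy data are hypothetical, and the generalization of this loss mentioned in the abstract is not reproduced here because the abstract does not describe it.

```python
import numpy as np

def best_of_m_loss(hypotheses, target):
    """Standard best-of-M loss (illustrative sketch, not the paper's code).

    hypotheses: array of shape (M, J, 3), M candidate 3D poses with J joints.
    target:     array of shape (J, 3), the ground-truth 3D pose.
    Returns the smallest mean per-joint error and the index of that hypothesis.
    """
    hypotheses = np.asarray(hypotheses, dtype=float)
    target = np.asarray(target, dtype=float)
    # Euclidean distance per joint for every hypothesis: shape (M, J)
    per_joint = np.linalg.norm(hypotheses - target[None, :, :], axis=-1)
    # Average over joints: one error value per hypothesis, shape (M,)
    per_hypothesis = per_joint.mean(axis=1)
    best_idx = int(np.argmin(per_hypothesis))
    return float(per_hypothesis[best_idx]), best_idx

# Toy data (hypothetical): a 2-joint pose at the origin and three hypotheses,
# each shifted from the target by a constant offset.
target = np.zeros((2, 3))
offsets = np.array([[3.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 2.0, 0.0]])
hypotheses = target[None, :, :] + offsets[:, None, :]

loss, best_idx = best_of_m_loss(hypotheses, target)
print(loss, best_idx)
```

Because only the best hypothesis is penalized, the remaining samples are free to cover other plausible poses, which is the usual motivation for this family of losses in multi-hypothesis prediction.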
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | MPI-INF-3DHP (test) | PCK | 84.3 | 559 |
| 3D Human Pose Estimation | Human3.6M (test) | MPJPE (Average) | 32.4 | 547 |
| 3D Human Pose Estimation | Human3.6M (Protocol #1) | MPJPE (Avg.) | 44.3 | 440 |
| 3D Human Pose Estimation | Human3.6M (Protocol 2) | Average MPJPE | 32.4 | 315 |
| 3D Human Pose Estimation | Human3.6M S9 and S11 (test) | -- | -- | 72 |
| 3D Pose Estimation | Human3.6M | -- | -- | 66 |
| 3D Pose Estimation | 3DHP | -- | -- | 25 |
| 3D Human Pose Estimation | Human3.6M Standard Protocol | MPJPE | 44.3 | 19 |
| 3D Human Pose Estimation | MPI-INF-3DHP | PCK (Overall) | 84.3 | 17 |
| 3D Human Pose Estimation | Human 3.6M Subjects 9 & 11 (test) | MPJPE | 44.3 | 16 |
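For readers unfamiliar with the two metrics in the table, here is a minimal NumPy sketch of how MPJPE (mean per-joint position error, lower is better) and PCK (percentage of correct keypoints, higher is better) are commonly computed. The function names, the toy poses, and the 150 mm PCK threshold are assumptions made for illustration. Any alignment or root-centering step that a specific evaluation protocol may require is omitted.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints.

    pred, gt: arrays of shape (J, 3), in the same unit (typically millimetres).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is at most `threshold`.

    The default threshold of 150 (mm) is an assumption for this sketch.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dists = np.linalg.norm(pred - gt, axis=-1)
    return float((dists <= threshold).mean() * 100.0)

# Toy data (hypothetical): three joints with errors of 100, 200 and 0 mm.
gt = np.zeros((3, 3))
pred = np.array([[100.0, 0.0, 0.0],
                 [0.0, 200.0, 0.0],
                 [0.0, 0.0, 0.0]])

print(mpjpe(pred, gt))  # mean of 100, 200 and 0
print(pck(pred, gt))    # two of three joints fall within the threshold
```

For a multi-hypothesis method, such metrics are typically reported for the best of the generated hypotheses, so the number of samples drawn should be kept in mind when comparing entries.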