Graph Stacked Hourglass Networks for 3D Human Pose Estimation
About
In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. The proposed architecture consists of repeated encoder-decoder modules, in which graph-structured features are processed across three different scales of human skeletal representations. This multi-scale architecture enables the model to learn both local and global feature representations, which are critical for 3D human pose estimation. We also introduce a multi-level feature learning approach that uses intermediate features from different depths, and we show the performance improvements that result from exploiting multi-scale, multi-level feature representations. Extensive experiments are conducted to validate our approach, and the results show that our model outperforms state-of-the-art methods.
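To make the encoder-decoder idea concrete, the following is a minimal sketch of one hourglass pass over three skeletal scales, repeated to form a stack. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy 8-joint chain skeleton, the joint groupings used for pooling, the weight-free neighbour averaging that stands in for a learned graph convolution, the additive skip connections, and all function names.

```python
from typing import List, Tuple

Vector = List[float]
Edge = Tuple[int, int]


def graph_average(features: List[Vector], edges: List[Edge]) -> List[Vector]:
    """Average each node with its neighbours (stand-in for a learned graph convolution)."""
    n = len(features)
    neighbours = [[i] for i in range(n)]  # every node includes itself (self-loop)
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in neighbours[i]) / len(neighbours[i]) for d in range(dim)]
        for i in range(n)
    ]


def pool(features: List[Vector], groups: List[List[int]]) -> List[Vector]:
    """Average the joints in each group to obtain a coarser skeleton."""
    dim = len(features[0])
    return [[sum(features[j][d] for j in g) / len(g) for d in range(dim)] for g in groups]


def unpool(coarse: List[Vector], groups: List[List[int]], n_fine: int) -> List[Vector]:
    """Copy each coarse node's feature back to every joint in its group."""
    fine: List[Vector] = [[] for _ in range(n_fine)]
    for group_index, group in enumerate(groups):
        for j in group:
            fine[j] = list(coarse[group_index])
    return fine


def add(a: List[Vector], b: List[Vector]) -> List[Vector]:
    """Element-wise sum, used here as a skip connection."""
    return [[x + y for x, y in zip(u, v)] for u, v in zip(a, b)]


def hourglass(features, edges_by_scale, groups_by_scale):
    """One encoder-decoder pass: fine -> mid -> coarse -> mid -> fine."""
    # Encoder: process features, then pool to the next coarser scale.
    f0 = graph_average(features, edges_by_scale[0])
    f1 = graph_average(pool(f0, groups_by_scale[0]), edges_by_scale[1])
    f2 = graph_average(pool(f1, groups_by_scale[1]), edges_by_scale[2])
    # Decoder: unpool and merge with the encoder feature of the same scale.
    d1 = graph_average(add(unpool(f2, groups_by_scale[1], len(f1)), f1), edges_by_scale[1])
    d0 = graph_average(add(unpool(d1, groups_by_scale[0], len(f0)), f0), edges_by_scale[0])
    return d0


def stacked_hourglass(features, edges_by_scale, groups_by_scale, num_stacks=2):
    """Repeat the encoder-decoder pass; the number of stacks is an assumed value."""
    for _ in range(num_stacks):
        features = hourglass(features, edges_by_scale, groups_by_scale)
    return features


# Hypothetical toy skeleton: an 8-joint chain, pooled to 4 nodes, then to 2.
EDGES = [
    [(i, i + 1) for i in range(7)],  # fine scale, 8 joints
    [(0, 1), (1, 2), (2, 3)],        # middle scale, 4 nodes
    [(0, 1)],                        # coarse scale, 2 nodes
]
GROUPS = [
    [[0, 1], [2, 3], [4, 5], [6, 7]],  # fine -> middle
    [[0, 1], [2, 3]],                  # middle -> coarse
]

if __name__ == "__main__":
    joints_2d = [[float(i), float(i % 3)] for i in range(8)]  # made-up 2D inputs
    for vector in stacked_hourglass(joints_2d, EDGES, GROUPS):
        print(vector)
```

In this sketch the coarse scale mixes information from distant joints (global context), while the skip connections preserve per-joint detail (local context), which is the intuition the abstract gives for the multi-scale design. A real model would replace `graph_average` with learned graph convolutions and add a regression head that outputs 3D coordinates.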
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | MPI-INF-3DHP (test) | PCK | 80.1 | 559 |
| 3D Human Pose Estimation | Human3.6M (test) | MPJPE (Average) | 35.8 | 547 |
| 3D Human Pose Estimation | Human3.6M (Protocol #1) | MPJPE (Avg.) | 35.8 | 440 |
| 3D Human Pose Estimation | Human3.6M Protocol 1 (test) | Dir. Error (Protocol 1) | 35.8 | 183 |
| 3D Human Pose Estimation | Human3.6M | MPJPE | 51.9 | 160 |
| 3D Human Pose Estimation | Human3.6M (S9, S11) | Average Error (MPJPE Avg) | 51.9 | 94 |
| 3D Human Pose Estimation | Human3.6M S9 and S11 (test) | Dir. Error | 35.8 | 72 |
| 3D Pose Estimation | Human3.6M | -- | -- | 66 |
| 3D Human Pose Estimation | Human3.6M v1 (test) | Avg Performance | 35.8 | 58 |
| 3D Human Pose Estimation | Human3.6M GT 2D pose sequences (test) | MPJPE (Dire.) | 35.8 | 29 |
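For readers unfamiliar with the metrics in the table above, here is a minimal sketch of how MPJPE (mean per-joint position error) and PCK (percentage of correct keypoints) are usually defined. The function names are illustrative, and the sketch omits protocol details such as root-joint alignment or rigid alignment, which vary between benchmarks. The 150 mm PCK threshold mentioned in the comments is the value commonly used with MPI-INF-3DHP and is stated here as an assumption, not taken from the table.

```python
import math
from typing import List, Sequence

Joint = Sequence[float]  # (x, y, z) coordinates, typically in millimetres


def joint_errors(pred: Sequence[Joint], gt: Sequence[Joint]) -> List[float]:
    """Euclidean distance between each predicted joint and its ground truth."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of joints")
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def mpjpe(pred: Sequence[Joint], gt: Sequence[Joint]) -> float:
    """Mean per-joint position error; lower is better."""
    errors = joint_errors(pred, gt)
    return sum(errors) / len(errors)


def pck(pred: Sequence[Joint], gt: Sequence[Joint], threshold: float = 150.0) -> float:
    """Percentage of joints whose error is within the threshold; higher is better.

    The default of 150.0 (mm) is an assumed, commonly used value.
    """
    errors = joint_errors(pred, gt)
    return 100.0 * sum(e <= threshold for e in errors) / len(errors)


if __name__ == "__main__":
    # Made-up two-joint pose: one exact prediction, one that is 5 units off.
    predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(mpjpe(predicted, ground_truth))               # 2.5
    print(pck(predicted, ground_truth, threshold=1.0))  # 50.0
```

Benchmark scores are averaged over all test frames, and often over actions as well, so a single-pose value like the one above is only the innermost step of the reported numbers.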