Graph Stacked Hourglass Networks for 3D Human Pose Estimation

About

In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation. The architecture consists of repeated encoder-decoder modules in which graph-structured features are processed across three different scales of human skeletal representations. This multi-scale design enables the model to learn both local and global feature representations, which are critical for 3D human pose estimation. We also introduce a multi-level feature learning approach that uses intermediate features taken from different depths, and we show the performance improvements that result from exploiting multi-scale, multi-level feature representations. Extensive experiments validate our approach, and the results show that our model outperforms the state of the art.
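To make the encoder-decoder idea concrete, here is a minimal NumPy sketch of one hourglass-style block over a skeleton graph: a graph convolution at the fine scale, pooling to a coarser skeleton, a graph convolution there, then unpooling with a skip connection. It is an illustration under stated assumptions, not the authors' implementation. The 6-joint chain skeleton, the joint groups, the max pooling, the copy-based unpooling, the symmetric adjacency normalization, and all function names are hypothetical choices made for this example, and only two scales are shown instead of the three described above.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    A = np.eye(num_nodes)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(X, A_hat, W):
    """One graph convolution layer with ReLU: aggregate neighbors, then project."""
    return np.maximum(A_hat @ X @ W, 0.0)

def pool(X, groups):
    """Coarsen the skeleton: max over the features of each joint group (assumed scheme)."""
    return np.stack([X[g].max(axis=0) for g in groups])

def unpool(X_coarse, groups, num_nodes):
    """Restore the fine skeleton by copying each coarse node to its member joints."""
    X = np.zeros((num_nodes, X_coarse.shape[1]))
    for k, g in enumerate(groups):
        X[g] = X_coarse[k]
    return X

def hourglass_block(X, A_fine, A_coarse, groups, W_down, W_mid, W_up):
    """Encoder (fine -> coarse) and decoder (coarse -> fine) with a skip connection."""
    h_fine = graph_conv(X, A_fine, W_down)
    h_coarse = graph_conv(pool(h_fine, groups), A_coarse, W_mid)
    h_up = unpool(h_coarse, groups, X.shape[0]) + h_fine
    return graph_conv(h_up, A_fine, W_up)

# Hypothetical toy skeleton: a 6-joint chain grouped into 3 coarse nodes.
fine_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
groups = [[0, 1], [2, 3], [4, 5]]
coarse_edges = [(0, 1), (1, 2)]
A_fine = normalized_adjacency(fine_edges, 6)
A_coarse = normalized_adjacency(coarse_edges, 3)

rng = np.random.default_rng(0)
pose_2d = rng.normal(size=(6, 2))   # one (x, y) input per joint
W_down = rng.normal(size=(2, 8))    # random weights stand in for learned ones
W_mid = rng.normal(size=(8, 8))
W_up = rng.normal(size=(8, 8))

features = hourglass_block(pose_2d, A_fine, A_coarse, groups, W_down, W_mid, W_up)
print(features.shape)
```

Stacking several such blocks and reading out features after each one would be one way to realize the repeated, multi-level structure the abstract describes. A final per-joint linear layer (not shown) would map the features to 3D coordinates.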

Tianhan Xu, Wataru Takano • 2021

Related benchmarks

Task	Dataset	Result
3D Human Pose Estimation	MPI-INF-3DHP (test)	PCK 80.1	606
3D Human Pose Estimation	Human3.6M (test)	MPJPE (Average) 35.8	570
3D Human Pose Estimation	Human3.6M (Protocol #1)	MPJPE (Avg.) 35.8	457
3D Human Pose Estimation	Human3.6M	MPJPE 51.9	197
3D Human Pose Estimation	Human3.6M Protocol 1 (test)	Dir. Error (Protocol 1) 35.8	183
3D Human Pose Estimation	Human3.6M (S9, S11)	Average Error (MPJPE Avg) 51.9	94
3D Human Pose Estimation	Human3.6M S9 and S11 (test)	Dir. Error 35.8	72
3D Pose Estimation	Human3.6M	--	66
3D Human Pose Estimation	Human3.6M v1 (test)	Avg Performance 35.8	58
3D Human Pose Estimation	Human3.6M GT 2D pose sequences (test)	MPJPE (Dire.) 35.8	29

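For readers unfamiliar with the metrics in the table, the sketch below shows how MPJPE (mean per-joint position error, lower is better) and PCK (percentage of correct keypoints, higher is better) are commonly computed from predicted and ground-truth 3D joints. The function names are illustrative. The millimetre units and the 150 mm PCK threshold are the conventions usually reported for these benchmarks and are assumptions here, since this page does not state them. Any alignment or protocol-specific preprocessing is omitted.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and true joints.

    pred, gt: arrays of shape (..., num_joints, 3), assumed to be in millimetres.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is at most `threshold` (assumed 150 mm)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=-1)
    return float((dist <= threshold).mean() * 100.0)

# Toy check: every joint is off by the vector (3, 4, 0), a distance of exactly 5.
gt = np.zeros((2, 3, 3))                 # 2 frames, 3 joints, xyz
pred = gt + np.array([3.0, 4.0, 0.0])
print(mpjpe(pred, gt))                   # 5.0
print(pck(pred, gt, threshold=5.0))      # 100.0
```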
