Convolutional Pose Machines

About

Pose Machines provide a sequential prediction framework for learning rich implicit spatial models. In this work we show a systematic design for how convolutional networks can be incorporated into the pose machine framework for learning image features and image-dependent spatial models for the task of pose estimation. The contribution of this paper is to implicitly model long-range dependencies between variables in structured prediction tasks such as articulated pose estimation. We achieve this by designing a sequential architecture composed of convolutional networks that directly operate on belief maps from previous stages, producing increasingly refined estimates for part locations, without the need for explicit graphical model-style inference. Our approach addresses the characteristic difficulty of vanishing gradients during training by providing a natural learning objective function that enforces intermediate supervision, thereby replenishing back-propagated gradients and conditioning the learning procedure. We demonstrate state-of-the-art performance and outperform competing methods on standard benchmarks including the MPII, LSP, and FLIC datasets.
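The sequential design and the intermediate supervision described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Each stage here is a toy per-pixel linear map (a 1x1 convolution), whereas the paper uses deep convolutional networks, and every name (`gaussian_belief_map`, `stage`, `forward`, `intermediate_supervision_loss`), the Gaussian target, and all shapes are assumptions made for illustration. The sketch shows only the two structural ideas: later stages take the previous stage's belief maps as extra input, and every stage's output contributes to the loss.

```python
import numpy as np


def gaussian_belief_map(height, width, center_xy, sigma=1.5):
    """Assumed target: a Gaussian peak at the ground-truth part location."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def stage(features, prev_beliefs, weights):
    """Toy stage: a 1x1 convolution over image features, concatenated with
    the previous stage's belief maps when they exist (hypothetical stand-in
    for a real convolutional network)."""
    if prev_beliefs is None:
        inputs = features
    else:
        inputs = np.concatenate([features, prev_beliefs], axis=0)
    # weights: (num_parts, in_channels); inputs: (in_channels, H, W)
    return np.tensordot(weights, inputs, axes=([1], [0]))


def forward(features, stage_weights):
    """Run all stages in sequence and keep every stage's belief maps."""
    beliefs, outputs = None, []
    for weights in stage_weights:
        beliefs = stage(features, beliefs, weights)
        outputs.append(beliefs)
    return outputs


def intermediate_supervision_loss(stage_outputs, target):
    """Sum of squared errors against the target, accumulated over every
    stage rather than only the last one."""
    return sum(float(np.sum((b - target) ** 2)) for b in stage_outputs)


# Usage with made-up sizes: 2 parts, 4 feature channels, 8x8 maps, 3 stages.
rng = np.random.default_rng(0)
num_parts, channels, H, W = 2, 4, 8, 8
features = rng.normal(size=(channels, H, W))
target = np.stack([
    gaussian_belief_map(H, W, (2, 3)),
    gaussian_belief_map(H, W, (5, 6)),
])
stage_weights = [0.1 * rng.normal(size=(num_parts, channels))]
stage_weights += [
    0.1 * rng.normal(size=(num_parts, channels + num_parts)) for _ in range(2)
]
outputs = forward(features, stage_weights)
loss = intermediate_supervision_loss(outputs, target)
```

Because the loss has one term per stage, each stage receives a gradient signal directly from its own output instead of only through the stages after it, which is the mechanism the abstract credits with countering vanishing gradients.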

Shih-En Wei, Varun Ramakrishna, Takeo Kanade, Yaser Sheikh • 2016

Related benchmarks

Task	Dataset	Metric	Result	
Human Pose Estimation	MPII (test)	Shoulder PCK	95	350
Human Pose Estimation	LSP (test)	Head Accuracy	97.8	102
2D Human Pose Estimation	MPII (val)	Head	96.2	61
Human Pose Estimation	J-HMDB sub	Head Accuracy	98.4	49
3D Pose Estimation	Total Capture (test)	Mean MPJPE	99	46
Human Pose Estimation	MPII	Head Accuracy	97.8	32
Pose Estimation	Penn Action Dataset (test)	Head	98.6	19
Human Pose Estimation	LSP PC annotations (test)	Torso Accuracy	0.98	16
Multi-person Pose Estimation	Multi-Person PoseTrack	Head Accuracy	0.488	15
Human Pose Estimation	MPII pose 03/15/2018 (full)	Head Accuracy	97.8	11

Showing 10 of 21 rows
