Human Pose Estimation using Deep Consensus Voting
About
In this paper we consider the problem of human pose estimation from a single still image. We propose a novel approach where each location in the image votes for the position of each keypoint using a convolutional neural net. The voting scheme allows us to utilize information from the whole image, rather than rely on a sparse set of keypoint locations. Using dense, multi-target votes not only produces good keypoint predictions but also enables us to compute image-dependent joint keypoint probabilities by looking at consensus voting. This differs from most previous methods, where joint probabilities are learned from relative keypoint locations and are independent of the image. Finally, we combine the keypoint votes and joint probabilities to identify the optimal pose configuration. We demonstrate competitive performance on the MPII Human Pose and Leeds Sports Pose datasets.
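To make the voting idea concrete, here is a minimal sketch of how dense votes for one keypoint could be accumulated into a heatmap and how the consensus location could be read off. It is a simplification and not the authors' implementation: the abstract states that the votes come from a convolutional neural net, whereas this toy example supplies the per-location offsets directly. The function names `aggregate_votes` and `consensus_location`, the integer-offset vote representation, and the per-vote weights are all assumptions made for illustration.

```python
import numpy as np

def aggregate_votes(offsets, weights, shape):
    """Accumulate per-location votes into a heatmap for one keypoint.

    offsets: (H, W, 2) integer array; offsets[y, x] = (dy, dx) is where
             location (y, x) believes the keypoint lies, relative to itself.
    weights: (H, W) non-negative confidence attached to each vote.
    shape:   (H, W) of the output heatmap.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    ty = ys + offsets[..., 0]  # absolute row each location votes for
    tx = xs + offsets[..., 1]  # absolute column each location votes for
    # Discard votes that land outside the image.
    valid = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    heat = np.zeros(shape, dtype=float)
    # np.add.at sums correctly when many votes hit the same cell.
    np.add.at(heat, (ty[valid], tx[valid]), weights[valid])
    return heat

def consensus_location(heat):
    """Return the (row, col) that received the largest total vote."""
    return np.unravel_index(np.argmax(heat), heat.shape)

# Toy example: every location votes for (5, 2), except one outlier.
h, w = 8, 8
target = (5, 2)
ys, xs = np.mgrid[0:h, 0:w]
offsets = np.stack([target[0] - ys, target[1] - xs], axis=-1)
offsets[0, 0] = (3, 3)  # outlier: location (0, 0) votes for (3, 3)
weights = np.ones((h, w))

heat = aggregate_votes(offsets, weights, (h, w))
peak = consensus_location(heat)
```

Because every location contributes, a single wrong vote barely affects the peak. That robustness is the intuition behind using dense votes rather than a sparse set of keypoint detections.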
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Human Pose Estimation | MPII (test) | Shoulder PCK | 93.3 | 314 |
| Human Pose Estimation | LSP (test) | Head Accuracy | 97.8 | 102 |
| Human Pose Estimation | MPII | Head Accuracy | 97.8 | 32 |
| Articulated Human Pose Estimation | LSP (test) | Upper Arms Accuracy | 80.4 | 28 |
| Human Pose Estimation | LSP PC annotations (test) | Torso Accuracy | 0.973 | 16 |
| Human Pose Estimation | LSP person-centric (test) | Head Accuracy | 96.8 | 9 |
| Human Pose Estimation | LSP extended (test) | -- | -- | 8 |