
Deep High-Resolution Representation Learning for Human Pose Estimation

About

This is an official PyTorch implementation of Deep High-Resolution Representation Learning for Human Pose Estimation. In this work, we are interested in the human pose estimation problem with a focus on learning reliable high-resolution representations. Most existing methods recover high-resolution representations from low-resolution representations produced by a high-to-low resolution network. Instead, our proposed network maintains high-resolution representations through the whole process. We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks in parallel. We conduct repeated multi-scale fusions such that each of the high-to-low resolution representations receives information from the other parallel representations over and over, leading to rich high-resolution representations. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise. We empirically demonstrate the effectiveness of our network through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset. The code and models are publicly available at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch.

Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang • 2019
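
The repeated multi-scale fusion described in the abstract can be illustrated with a toy example. The sketch below is not taken from the official repository: it uses plain Python lists as single-channel feature maps, and it assumes fixed 2x2 average pooling and nearest-neighbour upsampling as stand-ins for the learned convolutions a real network would use. The helper names (`downsample2x`, `upsample2x`, `add_maps`, `fuse`) are hypothetical. It only shows the data flow: two branches at different resolutions run in parallel, and at each exchange step every branch adds in a resized copy of the other.

```python
# Toy illustration of parallel multi-resolution branches with repeated fusion.
# Assumption: fixed pooling/upsampling replace the learned layers of a real model.


def downsample2x(fmap):
    """Halve the resolution by averaging each non-overlapping 2x2 block."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[r][c] + fmap[r][c + 1] + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def upsample2x(fmap):
    """Double the resolution by repeating every value in a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a, b):
    """Element-wise sum of two maps with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse(high, low):
    """One exchange step: each branch receives the other branch's resized map."""
    new_high = add_maps(high, upsample2x(low))
    new_low = add_maps(low, downsample2x(high))
    return new_high, new_low


# A 4x4 high-resolution map and a 2x2 low-resolution map (hypothetical values).
high = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
low = [[1.0, 2.0], [3.0, 4.0]]

# Repeating the exchange keeps both resolutions alive while mixing information.
for _ in range(2):
    high, low = fuse(high, low)

print(len(high), len(high[0]), len(low), len(low[0]))
```

Note that the high-resolution branch keeps its 4x4 shape throughout, rather than being recovered from the low-resolution one at the end, which is the design point the abstract emphasizes.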

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Image Classification | ImageNet-1K 1.0 (val) | Top-1 Accuracy | 79.5 | 1866 |
| Image Classification | ImageNet-1k (val) | Top-1 Acc | 79.3 | 706 |
| 3D Human Pose Estimation | Human3.6M (Protocol #1) | MPJPE (Avg.) | 53.2 | 440 |
| Human Pose Estimation | COCO (test-dev) | AP | 77 | 408 |
| 2D Human Pose Estimation | COCO 2017 (val) | AP | 77.4 | 386 |
| Pose Estimation | COCO (val) | AP | 78.1 | 319 |
| Human Pose Estimation | MPII (test) | Shoulder PCK | 96.9 | 314 |
| Human Pose Estimation | COCO 2017 (test-dev) | AP | 77 | 180 |
| Multi-person Pose Estimation | CrowdPose (test) | AP | 72.8 | 177 |
| Facial Landmark Detection | 300-W (Fullset) | Mean Error (%) | 3.34 | 174 |
Showing 10 of 116 rows
