
Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression

About

In this paper, we are interested in the bottom-up paradigm of estimating human poses from an image. We study the dense keypoint regression framework, which has previously been inferior to the keypoint detection and grouping framework. Our motivation is that regressing keypoint positions accurately requires learning representations that focus on the keypoint regions. We present a simple yet effective approach, named disentangled keypoint regression (DEKR). We adopt adaptive convolutions through a pixel-wise spatial transformer to activate the pixels in the keypoint regions and accordingly learn representations from them. We use a multi-branch structure for separate regression: each branch learns a representation with dedicated adaptive convolutions and regresses one keypoint. The resulting disentangled representations are able to attend to their respective keypoint regions, and thus the keypoint regression is spatially more accurate. We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods and achieves superior bottom-up pose estimation results on two benchmark datasets, COCO and CrowdPose. The code and models are available at https://github.com/HRNet/DEKR.
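The multi-branch idea above can be sketched in a few lines: shared backbone features are split into one group per keypoint, each branch warps its own features with per-pixel sampling offsets (an adaptive convolution in spirit), and then regresses one keypoint offset map. This is a minimal NumPy illustration, not the authors' implementation: `dekr_head` and `adaptive_sample` are hypothetical names, random weight matrices stand in for learned convolutions, and nearest-neighbor gathering stands in for bilinear deformable sampling.

```python
import numpy as np

def adaptive_sample(feat, offsets):
    # feat: (C, H, W); offsets: (H, W, 2) per-pixel sampling displacements.
    # Gather features at the displaced locations (nearest-neighbor, clipped
    # to the feature map; real deformable convs use bilinear interpolation).
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    sy = np.clip(np.round(ys + offsets[..., 0]).astype(int), 0, H - 1)
    sx = np.clip(np.round(xs + offsets[..., 1]).astype(int), 0, W - 1)
    return feat[:, sy, sx]  # (C, H, W) warped features

def dekr_head(feat, num_keypoints, rng):
    # Hypothetical disentangled head: split channels into one group per
    # keypoint; each branch predicts its own sampling offsets and then
    # regresses a 2-channel center-to-keypoint offset map.
    C, H, W = feat.shape
    cb = C // num_keypoints
    preds = []
    for k in range(num_keypoints):
        branch = feat[k * cb:(k + 1) * cb]           # this branch's features
        w_off = rng.normal(size=(2, cb)) * 0.01      # stand-in for offset conv
        offsets = np.einsum("oc,chw->hwo", w_off, branch)
        warped = adaptive_sample(branch, offsets)    # attend near the keypoint
        w_reg = rng.normal(size=(2, cb)) * 0.01      # stand-in for regressor
        preds.append(np.einsum("oc,chw->ohw", w_reg, warped))
    return np.stack(preds)  # (K, 2, H, W) per-keypoint offset maps

rng = np.random.default_rng(0)
feat = rng.normal(size=(34, 16, 16))       # toy backbone output
out = dekr_head(feat, num_keypoints=17, rng=rng)
print(out.shape)  # (17, 2, 16, 16)
```

The point of the per-branch split is disentanglement: each keypoint's regressor only sees (and only adapts) its own feature group, so no branch has to share representation capacity with the other keypoints.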

Zigang Geng, Ke Sun, Bin Xiao, Zhaoxiang Zhang, Jingdong Wang · 2021

Related benchmarks

Task | Dataset | Result | Rank
Human Pose Estimation | COCO (test-dev) | AP 71 | 408
2D Human Pose Estimation | COCO 2017 (val) | AP 72.1 | 386
Pose Estimation | COCO (val) | AP 72.3 | 319
Human Pose Estimation | COCO 2017 (test-dev) | AP 71 | 180
Multi-person Pose Estimation | CrowdPose (test) | AP 68 | 177
Multi-person Pose Estimation | COCO (test-dev) | AP 71 | 101
Pose Estimation | OCHuman (test) | AP 38.2 | 95
Multi-person Pose Estimation | OCHuman (val) | AP 38.8 | 40
Pose Estimation | OCHuman (val) | AP 37.9 | 24
Human Pose Estimation | ExLPose-OCN (test) | AP@0.5:0.95 (A7M3) 27.1 | 23

(Showing 10 of 18 rows.)

Other info

Code: https://github.com/HRNet/DEKR