
TokenPose: Learning Keypoint Tokens for Human Pose Estimation

About

Human pose estimation relies heavily on visual cues and anatomical constraints between parts to locate keypoints. Most existing CNN-based methods excel at visual representation but lack the ability to explicitly learn the constraint relationships between keypoints. In this paper, we propose a novel approach based on Token representation for human Pose estimation (TokenPose). In detail, each keypoint is explicitly embedded as a token to simultaneously learn constraint relationships and appearance cues from images. Extensive experiments show that the small and large TokenPose models are on par with state-of-the-art CNN-based counterparts while being more lightweight. Specifically, our TokenPose-S and TokenPose-L achieve 72.5 AP and 75.8 AP on the COCO validation dataset, respectively, with significant reductions in parameters (80.6% and 56.8%) and GFLOPs (75.3% and 24.7%). Code is publicly available.

Yanjie Li, Shoukui Zhang, Zhicheng Wang, Sen Yang, Wankou Yang, Shu-Tao Xia, Erjin Zhou • 2021
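
The idea in the abstract, that each keypoint is embedded as a token so it can learn both constraint relationships and appearance cues, can be illustrated with a small dependency-free sketch. This is not the authors' implementation: it assumes a single head of Transformer-style self-attention over a joint sequence of keypoint tokens and image-patch tokens, and every name and size in it (NUM_PATCHES, NUM_KEYPOINTS, DIM, the random weights) is hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned parameters or features."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
NUM_PATCHES = 6    # hypothetical number of flattened image patches
NUM_KEYPOINTS = 3  # hypothetical; COCO annotates 17 keypoints per person
DIM = 4            # hypothetical embedding width

# Stand-ins: patch embeddings would come from the image, and keypoint tokens
# would be learnable parameters in a trained model (assumption).
visual_tokens = random_matrix(NUM_PATCHES, DIM, rng)
keypoint_tokens = random_matrix(NUM_KEYPOINTS, DIM, rng)

# One joint sequence: keypoint tokens first, then visual tokens.
tokens = keypoint_tokens + visual_tokens
w_q, w_k, w_v = (random_matrix(DIM, DIM, rng) for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)

# Rows belonging to keypoint tokens attend to other keypoints (constraint
# relationships) and to image patches (appearance cues) in the same step.
keypoint_out = out[:NUM_KEYPOINTS]
kp_to_kp = [row[:NUM_KEYPOINTS] for row in weights[:NUM_KEYPOINTS]]
kp_to_img = [row[NUM_KEYPOINTS:] for row in weights[:NUM_KEYPOINTS]]
```

In this sketch, `kp_to_kp` holds the attention each keypoint token pays to the other keypoint tokens and `kp_to_img` the attention it pays to image patches. A real model would add components omitted here, such as position information, multiple heads and layers, and a prediction head.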

Related benchmarks

Task                      Dataset                                Result     Rank
Human Pose Estimation     COCO (test-dev)                        AP 75.9    408
2D Human Pose Estimation  COCO 2017 (val)                        AP 75.8    386
Pose Estimation           COCO (val)                             AP 75.9    319
Human Pose Estimation     COCO 2017 (test-dev)                   AP 75.9    180
2D Human Pose Estimation  MPII (val)                             Head 97.1  61
Keypoint Detection        COCO (val)                             AP 75.8    60
Pose Estimation           COCO                                   mAP 75.8   30
Human Pose Estimation     COCO 2014 (val)                        AP 75.8    18
Animal Pose Estimation    AP-10K (val)                           AP 72.7    17
Human Pose Estimation     infant pose estimation dataset (test)  AP 93      6
