Rethinking on Multi-Stage Networks for Human Pose Estimation
About
Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. Although multi-stage methods seem better suited to the task, in current practice they perform worse than single-stage methods. This work studies that gap. We argue that the unsatisfactory performance of current multi-stage methods stems from inadequate design choices. We propose several improvements, including the single-stage module design, cross-stage feature aggregation, and coarse-to-fine supervision. The resulting method establishes a new state of the art on both the MS COCO and MPII Human Pose datasets, demonstrating the effectiveness of a multi-stage architecture. The source code is publicly available for further research.
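The abstract names coarse-to-fine supervision without describing it. One plausible reading is that each stage is trained against a Gaussian target heatmap, with the Gaussian narrowing from stage to stage so that later stages must localize the keypoint more tightly. The sketch below illustrates that idea only. The function names, the 16x16 grid, and the sigma values are hypothetical and are not taken from the released source code.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma):
    """Return a height x width grid (list of rows) with a Gaussian peak at (cx, cy)."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def coarse_to_fine_targets(height, width, cx, cy, sigmas=(3.0, 2.0, 1.0)):
    """Build one target heatmap per stage.

    The sigmas are assumed values chosen for illustration: they shrink
    across stages, so early stages get a broad (coarse) target and later
    stages get a sharp (fine) one.
    """
    return [gaussian_heatmap(height, width, cx, cy, s) for s in sigmas]


# Hypothetical keypoint at the center of a small 16x16 heatmap.
targets = coarse_to_fine_targets(16, 16, cx=8, cy=8)
for stage, heatmap in enumerate(targets, start=1):
    mass = sum(sum(row) for row in heatmap)
    print(f"stage {stage}: peak={heatmap[8][8]:.1f}, total mass={mass:.2f}")
```

Every stage keeps the same peak location and height, while the total mass of the target falls as sigma shrinks, which is what makes the later targets stricter.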
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Human Pose Estimation | COCO (test-dev) | AP | 78.1 | 408 |
| 2D Human Pose Estimation | COCO 2017 (val) | AP | 75.9 | 386 |
| Human Pose Estimation | MPII (test) | Shoulder PCK | 97.1 | 314 |
| Multi-person Pose Estimation | COCO (test-dev) | AP | 77.1 | 101 |
| Multi-person Pose Estimation | COCO 2017 (test-dev) | AP | 78.1 | 99 |
| Keypoint Detection | MS COCO 2017 (test-dev) | AP | 78.1 | 43 |
| Human Keypoint Detection | COCO | AP | 78.1 | 30 |
| Pose Estimation | COCO (test) | AP | 76.1 | 28 |
| Human Keypoint Detection | MS COCO (test-dev) | AP | 78.1 | 19 |
| Pose Estimation | COCO 2017 (test-challenge) | AP | 76.4 | 6 |