Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
About
This paper studies the task of estimating the 3D human poses of multiple persons from multiple calibrated camera views. Following the top-down paradigm, we decompose the task into two stages, person localization and pose estimation, each processed in a coarse-to-fine manner, and we propose three task-specific graph neural networks for effective message passing. For 3D person localization, we first use the Multi-view Matching Graph Module (MMG) to learn the cross-view association and recover coarse human proposals. The Center Refinement Graph Module (CRG) then refines these proposals via flexible point-based prediction. For 3D pose estimation, the Pose Regression Graph Module (PRG) learns both the multi-view geometry and the structural relations between human joints. Our approach achieves state-of-the-art performance on the CMU Panoptic and Shelf datasets with significantly lower computational complexity.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | Campus | PCP | 2.4 | 36 |
| Multi-person 3D Pose Estimation | Shelf dataset | Actor 1 Score | 99.3 | 27 |
| 3D Pose Estimation | shelf | PCP Actor 1 | 99.3 | 25 |
| Multi-person 3D Human Pose Estimation | CMU Panoptic (test) | MPJPE (Average) | 15.84 | 22 |
| 3D Multi-person Pose Estimation | MVOR 23 (test) | MPJPE (mm) | 120 | 16 |
| 3D Human Pose Estimation | CMU Panoptic JLT+15 (test) | MPJPE | 15.63 | 14 |
| 3D Human Pose Estimation | Human3.6M (S9) | PCP | 82.2 | 14 |
| 3D Human Pose Estimation | Chi3D | Invalid Rate | 1.02e+3 | 14 |
| Multi-person 3D Pose Estimation | Shelf (transfer) | PCP | 98.8 | 13 |
| 3D Multi-person Pose Estimation | Panoptic (test) | PCP | 99.5 | 12 |
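The table reports results as MPJPE and PCP without defining them. The sketch below shows how these two metrics are commonly computed for a single person. It is not the evaluation code of any benchmark listed here. The function names `mpjpe` and `pcp`, the toy skeleton, and the threshold `alpha=0.5` are illustrative assumptions, and real protocols typically add steps such as matching predicted persons to ground-truth persons before scoring.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth joints, in the input units."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def pcp(pred, gt, limbs, alpha=0.5):
    """Percentage of correct parts. A limb (a, b) counts as correct when
    the mean error of its two endpoints is at most alpha times the
    ground-truth limb length. alpha=0.5 is an assumed, commonly used value."""
    if not limbs:
        raise ValueError("limbs must be non-empty")
    correct = 0
    for a, b in limbs:
        limb_len = math.dist(gt[a], gt[b])
        endpoint_err = 0.5 * (math.dist(pred[a], gt[a]) + math.dist(pred[b], gt[b]))
        if endpoint_err <= alpha * limb_len:
            correct += 1
    return 100.0 * correct / len(limbs)


# Hypothetical three-joint skeleton, coordinates assumed to be in millimetres.
gt_joints = [(0.0, 0.0, 0.0), (0.0, 0.0, 100.0), (0.0, 100.0, 100.0)]
pred_joints = [(3.0, 4.0, 0.0), (0.0, 0.0, 100.0), (0.0, 100.0, 220.0)]
skeleton = [(0, 1), (1, 2)]  # limbs as pairs of joint indices

print(mpjpe(pred_joints, gt_joints))
print(pcp(pred_joints, gt_joints, skeleton))
```

In this toy case the per-joint errors are 5, 0, and 120 mm, so the second limb fails the PCP test even though the first is nearly perfect. This illustrates why the two metrics can rank methods differently: MPJPE averages raw distances, while PCP thresholds each limb relative to its length.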