A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation

About

The dominant paradigm in 3D human pose estimation, which lifts a 2D pose sequence to 3D, relies heavily on long-term temporal cues (i.e., a daunting number of video frames) for improved accuracy. This reliance leads to performance saturation, intractable computation, and the non-causal problem. These issues can be attributed to the lifting methods' inherent inability to perceive spatial context, as plain 2D joint coordinates carry no visual cues. To address this, we propose a straightforward yet powerful solution: leveraging the readily available intermediate visual representations produced by off-the-shelf (pre-trained) 2D pose detectors, with no fine-tuning on the 3D task needed. The key observation is that, while the pose detector learns to localize 2D joints, such representations (e.g., feature maps) implicitly encode joint-centric spatial context thanks to the regional operations in backbone networks. We design a simple baseline named Context-Aware PoseFormer to showcase the effectiveness of this approach. Without access to any temporal information, the proposed method significantly outperforms its context-agnostic counterpart, PoseFormer, and other state-of-the-art methods that use up to hundreds of video frames, in both speed and precision. Project page: https://qitaozhao.github.io/ContextAware-PoseFormer
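
To make the idea of "joint-centric spatial context" more concrete, the sketch below bilinearly samples a detector feature map at each detected 2D joint location, yielding one feature vector per joint. This is only an illustration under stated assumptions, not the authors' implementation: the function name `sample_joint_features`, the nested-list layout `[channel][row][col]`, and the choice of plain bilinear sampling are all hypothetical, and Context-Aware PoseFormer may extract and fuse features differently.

```python
from math import floor


def sample_joint_features(feature_map, joints_xy):
    """Bilinearly sample a C x H x W feature map at 2D joint locations.

    feature_map: nested list indexed as [channel][row][col] (assumed layout).
    joints_xy:   list of (x, y) joint positions in feature-map pixel units.
    Returns one C-dimensional feature vector per joint.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    sampled = []
    for x, y in joints_xy:
        # Clamp so joints detected slightly outside the map stay valid.
        x = min(max(float(x), 0.0), width - 1.0)
        y = min(max(float(y), 0.0), height - 1.0)
        x0, y0 = int(floor(x)), int(floor(y))
        x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
        wx, wy = x - x0, y - y0  # fractional offsets inside the cell
        vector = []
        for channel in feature_map:
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - wx) * channel[y0][x0] + wx * channel[y0][x1]
            bottom = (1 - wx) * channel[y1][x0] + wx * channel[y1][x1]
            vector.append((1 - wy) * top + wy * bottom)
        sampled.append(vector)
    return sampled


# Toy example: a single-channel 2 x 2 feature map and two "joints".
toy_map = [[[0.0, 1.0], [2.0, 3.0]]]
features = sample_joint_features(toy_map, [(0.5, 0.5), (1.0, 1.0)])
```

Because each entry of a feature map summarizes an image region rather than a single pixel, a vector sampled this way carries visual context around the joint that the bare (x, y) coordinate does not.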

Qitao Zhao, Ce Zheng, Mengyuan Liu, Chen Chen • 2023

Related benchmarks

Task | Dataset | Result | Rank
3D Human Pose Estimation | MPI-INF-3DHP (test) | PCK 98.2 | 559
3D Human Pose Estimation | Human3.6M (test) | -- | 547
3D Human Pose Estimation | Human3.6M | MPJPE 39.8 | 160
3D Human Pose Estimation | MPI-INF-3DHP | -- | 108
3D Human Pose Estimation | Human3.6M (S9, S11) | Average Error (MPJPE Avg) 43.4 | 94
Motion In-betweening | Humanoid User Study (test) | Similar Score 13.1 | 5
Motion In-betweening | Mixamo Humanoid | HL2Q 0.0971 | 5
Motion In-betweening | Truebones Zoo Non-humanoid | HL2Q 0.2465 | 3
Motion In-betweening | Non-humanoid User Study (test) | Similarity 3.97 | 3
3D Human Pose Estimation | Human3.6M challenging subset | MPJPE (mm) 82.4 | 2
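
For readers unfamiliar with the metrics in the benchmark results, the sketch below shows how MPJPE (mean per-joint position error, lower is better) and PCK (percentage of correct keypoints, higher is better) are commonly computed for a single pose. It assumes predicted and ground-truth joints are given in millimetres and are already aligned at the root joint. The 150 mm default threshold is a value often used for 3D PCK on MPI-INF-3DHP and is an assumption here, not something stated on this page; the function names are illustrative.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints."""
    errors = [dist(p, g) for p, g in zip(pred, gt)]
    return sum(errors) / len(errors)


def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is within `threshold` (assumed mm)."""
    errors = [dist(p, g) for p, g in zip(pred, gt)]
    hits = sum(1 for e in errors if e <= threshold)
    return 100.0 * hits / len(errors)


# Toy pose with two joints: one exact, one off by 50 mm.
pred = [(0.0, 0.0, 0.0), (30.0, 40.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, gt))                # 25.0
print(pck(pred, gt, threshold=40.0))  # 50.0
```

Benchmark numbers are typically these per-pose values averaged over all test frames, and the exact evaluation protocol differs between datasets.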
