
Predicting Camera Viewpoint Improves Cross-dataset Generalization for 3D Human Pose Estimation

About

Monocular estimation of 3D human pose has attracted increased attention with the availability of large ground-truth motion capture datasets. However, the diversity of available training data is limited, and it is not clear to what extent methods generalize outside the specific datasets they are trained on. In this work, we carry out a systematic study of the diversity and biases present in specific datasets and their effect on cross-dataset generalization across a compendium of five pose datasets. We specifically focus on systematic differences in the distribution of camera viewpoints relative to a body-centered coordinate frame. Based on this observation, we propose an auxiliary task of predicting the camera viewpoint in addition to pose. We find that models trained to jointly predict viewpoint and pose systematically show significantly improved cross-dataset generalization.

Zhe Wang, Daeyun Shin, Charless C. Fowlkes • 2020
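
The auxiliary task described in the abstract can be pictured as a two-term training objective: one term for the 3D pose and one for the camera viewpoint. The sketch below is illustrative only and is not taken from the paper. It assumes the viewpoint is given as a 3x3 rotation matrix and scored by the geodesic angle between rotations, and the names `pose_loss`, `rotation_angle`, `joint_loss`, and the weight `view_weight` are hypothetical. The authors' actual viewpoint representation and loss may differ.

```python
import numpy as np

def pose_loss(pose_pred, pose_gt):
    """Mean Euclidean distance between predicted and true joints, shape (J, 3)."""
    return float(np.linalg.norm(pose_pred - pose_gt, axis=-1).mean())

def rotation_angle(R_pred, R_gt):
    """Geodesic angle (radians) between two 3x3 rotation matrices."""
    cos = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def joint_loss(pose_pred, pose_gt, R_pred, R_gt, view_weight=0.1):
    """Pose term plus a weighted viewpoint term (view_weight is an assumed value)."""
    return pose_loss(pose_pred, pose_gt) + view_weight * rotation_angle(R_pred, R_gt)
```

With this form, a model whose pose is correct but whose viewpoint is wrong still receives a non-zero loss, which is the sense in which the viewpoint acts as an auxiliary supervision signal.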

Related benchmarks

Task                      Dataset                        Result              Rank
3D Human Pose Estimation  MPI-INF-3DHP (test)            PCK 76.1            559
3D Human Pose Estimation  Human3.6M (test)               --                  547
3D Human Pose Estimation  3DPW (test)                    --                  505
3D Human Pose Estimation  3DPW                           --                  119
3D Human Pose Estimation  MPI-INF-3DHP                   PCK 84.3            108
3D Human Pose Estimation  3DPW cross-dataset (test)      PA-MPJPE 68.3       27
3D Human Pose Estimation  H36M                           MPJPE 52            16
3D Pose Estimation        SURREAL                        3D Pose Error 37.1  7
3D Pose Estimation        H36M 14-joint skeleton (test)  MPJPE 52            6
3D Human Pose Estimation  GPA                            MPJPE 53.3          2
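
The metrics named in the table follow standard definitions, which can be sketched as follows. This is a minimal NumPy illustration, not the evaluation code behind the listed numbers. It assumes poses are arrays of shape (J, 3) in millimetres, and the 150 mm PCK threshold is an assumption, since the page does not state the threshold used. The function names are hypothetical.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is below the threshold (assumed 150 mm)."""
    dist = np.linalg.norm(pred - gt, axis=-1)
    return float((dist < threshold).mean() * 100.0)

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment (rotation, scale, translation) of pred to gt."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    X, Y = pred - mu_p, gt - mu_g
    U, S, Vt = np.linalg.svd(X.T @ Y)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:
        # Flip the last singular direction so R is a proper rotation, not a reflection.
        Vt[-1] *= -1
        S[-1] *= -1
        R = Vt.T @ U.T
    scale = S.sum() / (X ** 2).sum()
    aligned = scale * X @ R.T + mu_g
    return mpjpe(aligned, gt)
```

MPJPE penalizes any global rotation or scale error in the prediction, while PA-MPJPE removes those before measuring, so the two values are not directly comparable across rows.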