
Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

About

Learning-based methods have dominated 3D human pose estimation (HPE), performing significantly better than traditional optimization-based methods on most benchmarks. Nonetheless, 3D HPE in the wild remains the biggest challenge for learning-based models, whether they use 2D-3D lifting, image-to-3D, or diffusion-based methods, because the trained networks implicitly learn camera intrinsic parameters and domain-specific 3D human pose distributions and estimate poses by statistical average. Optimization-based methods, on the other hand, estimate results case by case and can therefore predict more diverse and sophisticated human poses in the wild. By combining the advantages of optimization-based and learning-based methods, we propose the Zero-shot Diffusion-based Optimization (ZeDO) pipeline for 3D HPE to address cross-domain and in-the-wild 3D HPE. Our multi-hypothesis ZeDO achieves state-of-the-art (SOTA) performance on Human3.6M, with a minMPJPE of 51.4 mm, without training on any 2D-3D or image-3D pairs. Moreover, our single-hypothesis ZeDO achieves SOTA performance on the 3DPW dataset, with a PA-MPJPE of 40.3 mm in cross-dataset evaluation, outperforming even learning-based methods trained on 3DPW.
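
The error metrics quoted in the abstract (MPJPE, PA-MPJPE, minMPJPE) can be illustrated with a minimal NumPy sketch. This is not taken from the ZeDO code: the function names are hypothetical, poses are assumed to be arrays of shape (J, 3) in millimetres, and minMPJPE is assumed to be the lowest MPJPE among a set of hypotheses for one pose. Published evaluation protocols often add steps not shown here, such as root-joint alignment before computing MPJPE.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def procrustes_align(pred, gt):
    """Align pred (J, 3) to gt (J, 3) with a similarity transform
    (rotation, uniform scale, translation) via an SVD-based Procrustes fit."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance gives the optimal rotation.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    fix = np.array([1.0, 1.0, d])
    rot = (u * fix) @ vt
    scale = (s * fix).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g

def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE: error left after removing rotation, scale, translation."""
    return mpjpe(procrustes_align(pred, gt), gt)

def min_mpjpe(hypotheses, gt):
    """Lowest MPJPE among several hypotheses of shape (H, J, 3) for one pose."""
    return min(mpjpe(h, gt) for h in hypotheses)
```

Because PA-MPJPE discards global rotation, scale, and translation, it measures only the error in the articulated pose, which is why it is never larger than the MPJPE of the same prediction when the alignment is optimal.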

Zhongyu Jiang, Zhuoran Zhou, Lei Li, Wenhao Chai, Cheng-Yen Yang, Jenq-Neng Hwang • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
3D Human Pose Estimation | Human3.6M (test) | MPJPE (Average) | 42.1 | 547
3D Human Pose Estimation | 3DPW (test) | PA-MPJPE | 30.6 | 505
3D Human Pose Estimation | 3DPW | PA-MPJPE | 40.3 | 119
3D Human Pose Estimation | Human3.6M (S9, S11) | Average Error (MPJPE Avg) | 51.4 | 94
3D Human Pose Estimation | Human 3.6M Subjects 9 & 11 (test) | MPJPE | 51.4 | 16
3D Human Pose Estimation | MPI-INF-3DHP sampled 2929 frame (test) | MPJPE | 55.2 | 15
3D Human Pose Estimation | Ski-Pose (cross-dataset evaluation) | PA-MPJPE | 56.8 | 7
3D Human Pose Estimation | Human3.6M Detected 2D inputs (DT) | PA-MPJPE | 49 | 6
3D Human Pose Estimation | MPI-INF-3DHP cross-domain | PCK (%) | 90.2 | 6
3D Human Pose Estimation | Human3.6M GT 2D keypoints | PA-MPJPE | 35.8 | 5