
Self-Attentive 3D Human Pose and Shape Estimation from Videos

About

We consider the task of estimating 3D human pose and shape from videos. While existing frame-based approaches have made significant progress, these methods are independently applied to each image, thereby often leading to inconsistent predictions. In this work, we present a video-based learning algorithm for 3D human pose and shape estimation. The key insights of our method are two-fold. First, to address the inconsistent temporal prediction issue, we exploit temporal information in videos and propose a self-attention module that jointly considers short-range and long-range dependencies across frames, resulting in temporally coherent estimations. Second, we model human motion with a forecasting module that allows the transition between adjacent frames to be smooth. We evaluate our method on the 3DPW, MPI-INF-3DHP, and Human3.6M datasets. Extensive experimental results show that our algorithm performs favorably against the state-of-the-art methods.
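As a rough illustration of how a self-attention module can relate every frame to every other frame, covering both short-range and long-range dependencies, the sketch below applies generic scaled dot-product attention along the time axis of per-frame features. It is not the authors' module. The function name `temporal_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are assumptions made for this example.

```python
import numpy as np


def temporal_self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over the time axis.

    features: (T, D) array, one feature vector per video frame (hypothetical).
    w_q, w_k, w_v: (D, d) projection matrices (hypothetical parameters).
    Returns the attended features (T, d) and the attention weights (T, T).
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v

    # Each frame scores every other frame, near or far in time.
    scores = q @ k.T / np.sqrt(k.shape[-1])

    # Numerically stable softmax over the frames being attended to.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_frames, feat_dim, proj_dim = 16, 32, 8  # illustrative sizes only
    feats = rng.normal(size=(num_frames, feat_dim))
    w_q, w_k, w_v = (rng.normal(size=(feat_dim, proj_dim)) for _ in range(3))
    out, attn = temporal_self_attention(feats, w_q, w_k, w_v)
    print(out.shape, attn.shape)
```

Row `i` of the weight matrix shows how much frame `i` draws on each frame in the clip, which is what lets a single layer mix information from adjacent and distant frames alike.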

Yun-Chun Chen, Marco Piccirilli, Robinson Piramuthu, Ming-Hsuan Yang • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| 3D Human Pose Estimation | 3DPW (test) | PA-MPJPE 50.4 | 505 |
| 3D Human Pose and Shape Estimation | Human3.6M (test) | PA-MPJPE 38.7 | 119 |
| 3D Human Mesh Recovery | Human3.6M (Protocol 2) | Reconstruction Error 58.9 | 29 |
| 3D Human Pose and Shape Estimation | 3DPW 2018 (test) | PA-MPJPE 50.4 | 10 |
| 3D Human Pose and Shape Estimation | MPI-INF-3DHP 2017a (test) | PA-MPJPE 60.7 | 4 |
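For readers unfamiliar with the PA-MPJPE metric reported above, the following is a minimal sketch of how it is commonly computed: the predicted joints are aligned to the ground truth with a similarity transform (scale, rotation, translation) found by Procrustes analysis, and the mean per-joint Euclidean distance is then taken. The function name `pa_mpjpe` and the `(J, 3)` input layout are assumptions for illustration. This is not the evaluation code used to produce the numbers in the table.

```python
import numpy as np


def pa_mpjpe(pred, gt):
    """Procrustes-aligned mean per-joint position error.

    pred, gt: (J, 3) arrays of 3D joint positions (hypothetical layout).
    Assumes pred is not degenerate (its joints are not all at one point).
    The result is in the same length unit as the inputs.
    """
    mu_pred = pred.mean(axis=0)
    mu_gt = gt.mean(axis=0)
    p = pred - mu_pred
    g = gt - mu_gt

    # SVD of the 3x3 cross-covariance gives the optimal rotation (Kabsch).
    u, s, vt = np.linalg.svd(p.T @ g)

    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = np.ones(3)
    if np.linalg.det(u @ vt) < 0:
        d[-1] = -1.0
    rot = u @ np.diag(d) @ vt

    # Least-squares optimal isotropic scale for the chosen rotation.
    scale = (s * d).sum() / (p ** 2).sum()

    aligned = scale * p @ rot + mu_gt
    return np.linalg.norm(aligned - gt, axis=1).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(14, 3))              # illustrative joint count
    noisy = gt + 0.01 * rng.normal(size=gt.shape)
    print(pa_mpjpe(noisy, gt))
```

Because the alignment removes global scale, rotation, and translation, the metric measures only the error in the articulated pose itself.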