3D Human Motion Estimation via Motion Compression and Refinement

About

We develop a technique for generating smooth and accurate 3D human pose and motion estimates from RGB video sequences. Our method, which we call Motion Estimation via Variational Autoencoder (MEVA), decomposes a temporal sequence of human motion into a smooth motion representation, obtained through auto-encoder-based motion compression, and a residual representation, learned through motion refinement. This two-stage encoding first estimates the coarse overall motion and then estimates a residual that adds back person-specific motion details. Experiments show that our method produces both smooth and accurate 3D human pose and motion estimates.
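The coarse-plus-residual decomposition described above can be illustrated with a small sketch. This is not the authors' model: MEVA learns its compression with a variational autoencoder and regresses the residual from video, whereas the sketch below assumes a fixed low-pass compression (keeping the lowest temporal DCT coefficients) as a stand-in and computes the residual directly from a known motion. The names `dct_basis`, `compress_motion`, `decompose`, and `mean_accel` are hypothetical and exist only for this illustration.

```python
import numpy as np


def dct_basis(num_frames: int) -> np.ndarray:
    """Orthonormal DCT-II basis of shape (num_frames, num_frames); row k is the k-th frequency."""
    n = np.arange(num_frames)
    basis = np.cos(np.pi * (n[None, :] + 0.5) * n[:, None] / num_frames)
    basis[0] *= np.sqrt(1.0 / num_frames)
    basis[1:] *= np.sqrt(2.0 / num_frames)
    return basis


def compress_motion(motion: np.ndarray, num_coeffs: int) -> np.ndarray:
    """Stand-in for learned compression (assumption): keep the lowest temporal frequencies.

    motion has shape (T, D): T frames, D pose parameters per frame.
    """
    basis = dct_basis(motion.shape[0])
    coeffs = basis @ motion          # temporal frequencies of each pose parameter
    coeffs[num_coeffs:] = 0.0        # discard high-frequency detail
    return basis.T @ coeffs          # smooth, coarse motion


def decompose(motion: np.ndarray, num_coeffs: int):
    """Split a motion into a smooth coarse part and the residual detail left over."""
    coarse = compress_motion(motion, num_coeffs)
    residual = motion - coarse
    return coarse, residual


def mean_accel(motion: np.ndarray) -> float:
    """Mean magnitude of the second temporal difference, a simple smoothness measure."""
    accel = np.diff(motion, n=2, axis=0)
    return float(np.linalg.norm(accel, axis=-1).mean())
```

In this toy setting, `coarse + residual` reproduces the input exactly, and `mean_accel(coarse)` is typically much lower than `mean_accel(motion)` for jittery input, which mirrors the intended split between overall motion and fine detail.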

Zhengyi Luo, S. Alireza Golestaneh, Kris M. Kitani • 2020

Related benchmarks

Task	Dataset	Result
3D Human Pose Estimation	MPI-INF-3DHP (test)	--	606
3D Human Pose Estimation	Human3.6M (test)	--	570
3D Human Pose Estimation	3DPW (test)	PA-MPJPE 54.7	527
3D Human Pose and Shape Estimation	3DPW (test)	MPJPE-PA 54.7	158
3D Human Mesh Recovery	Human3.6M (test)	PA-MPJPE 53.2	145
3D Human Pose and Shape Estimation	Human3.6M (test)	PA-MPJPE 53.2	119
3D Human Pose and Shape Estimation	3DPW	PA-MPJPE 54.7	74
3D Human Pose and Shape Estimation	MPI-INF-3DHP (test)	MPJPE 96.4	46
3D Human Mesh Estimation	3DPW (test)	PA-MPJPE 54.7	44
3D Human Pose and Mesh Recovery	3DPW	PA-MPJPE 54.7	43
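The metrics in the table, MPJPE (mean per-joint position error) and PA-MPJPE (the same error after Procrustes alignment), can be sketched as follows for a single frame. This is a minimal illustration under assumptions, not any benchmark's official evaluation code: joints are taken to be `(J, 3)` arrays, the helper names are hypothetical, and evaluation protocols may differ in details such as root alignment and joint sets. Errors come out in the units of the input joints, conventionally millimetres.

```python
import numpy as np


def mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Euclidean distance between predicted and ground-truth joints, shape (J, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """Align pred to gt with the best similarity transform (scale, rotation, translation)."""
    mu_pred, mu_gt = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_pred, gt - mu_gt
    # Kabsch/Umeyama: SVD of the 3x3 cross-covariance between the centred point sets.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    correction = np.array([1.0, 1.0, d])
    rotation = vt.T @ np.diag(correction) @ u.T
    scale = (s * correction).sum() / (p ** 2).sum()
    return scale * p @ rotation.T + mu_gt


def pa_mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """MPJPE computed after Procrustes-aligning the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)
```

Because the alignment removes global scale, rotation, and translation, PA-MPJPE measures pose error alone, while MPJPE also penalizes errors in global placement. For a video, the per-frame values would be averaged over all frames.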

