
TexturePose: Supervising Human Mesh Estimation with Texture Consistency

About

This work addresses the problem of model-based human pose estimation. Recent approaches have made significant progress towards regressing the parameters of parametric human body models directly from images. Because of the absence of images with 3D shape ground truth, these approaches rely on 2D annotations or sophisticated architecture designs. In this work, we advocate that there are more cues we can leverage, which are available for free in natural images, i.e., without collecting more annotations or modifying the network architecture. We propose a natural form of supervision that capitalizes on the appearance constancy of a person across different frames (or viewpoints). This seemingly insignificant and often overlooked cue goes a long way for model-based pose estimation. The parametric model we employ allows us to compute a texture map for each frame. Assuming that the texture of the person does not change dramatically between frames, we can apply a novel texture consistency loss, which enforces that each point in the texture map has the same texture value across all frames. Since the texture is transferred into this common texture map space, no camera motion computation is necessary, nor even an assumption of smoothness between frames. This makes our proposed supervision applicable in a variety of settings, ranging from monocular video to multi-view images. We benchmark our approach against strong baselines that require the same or even more annotations than we do, and we consistently outperform them. At the same time, we achieve state-of-the-art results among model-based pose estimation approaches on different benchmarks. The project website with videos, results, and code can be found at https://seas.upenn.edu/~pavlakos/projects/texturepose.
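The core idea above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, array shapes, and the use of a visibility-weighted mean with an L1 penalty are illustrative assumptions; the paper's actual loss operates on texture maps produced by the parametric body model during training.

```python
import numpy as np

def texture_consistency_loss(textures, visibility):
    """Illustrative sketch of a texture-consistency penalty.

    textures:   (F, H, W, 3) per-frame texture maps, unwrapped into a
                common UV space (hypothetical input format).
    visibility: (F, H, W) binary masks marking which texels are
                actually observed in each frame.

    For every texel, penalize the deviation of each frame's observed
    value from the visibility-weighted mean across frames, so that the
    same surface point is encouraged to keep the same color.
    """
    vis = visibility[..., None].astype(np.float64)           # (F, H, W, 1)
    counts = vis.sum(axis=0)                                 # (H, W, 1)
    # Mean texture over frames, counting only visible texels.
    mean_tex = (textures * vis).sum(axis=0) / np.maximum(counts, 1.0)
    # L1 penalty between each frame and the shared mean, visible texels only.
    diff = np.abs(textures - mean_tex[None]) * vis
    return diff.sum() / np.maximum(vis.sum(), 1.0)
```

Because the comparison happens in the shared UV space, the sketch needs no camera motion or temporal smoothness term: frames from a monocular video or from different calibrated views can be mixed freely, as the abstract notes.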

Georgios Pavlakos, Nikos Kolotouros, Kostas Daniilidis • 2019

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| 3D Human Pose Estimation | Human3.6M (test) | - | - | 547 |
| 3D Human Pose Estimation | Human3.6M (Protocol #1) | MPJPE (Avg.) | 110.7 | 440 |
| 3D Human Pose Estimation | Human3.6M (Protocol #2) | - | - | 315 |
| 3D Human Pose Estimation | Human3.6M Protocol #2 (test) | Average Error | 49.7 | 140 |
| Human Mesh Recovery | Human3.6M | Reconstruction Error | 49.7 | 47 |
| Foreground-Background Segmentation | LSP (test) | Accuracy | 91.82 | 36 |
| Human Body Part Segmentation | LSP (test) | Accuracy | 89 | 34 |
| 3D Human Pose Estimation | Human3.6M Semi-supervised (S9, S11) | PMPJPE | 74.5 | 10 |

Other info

Code
