THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers

About

We present THUNDR, a transformer-based deep neural network methodology for reconstructing the 3D pose and shape of people from monocular RGB images. Key to our methodology is an intermediate 3D marker representation, which aims to combine the predictive power of model-free output architectures with the regularizing, anthropometry-preserving properties of a statistical human surface model such as GHUM -- a recently introduced, expressive, full-body statistical 3D human model trained end-to-end. Our novel transformer-based prediction pipeline can focus on image regions relevant to the task, supports self-supervised training regimes, and ensures that solutions are consistent with human anthropometry. We show state-of-the-art results on Human3.6M and 3DPW, for both the fully supervised and the self-supervised models, on the task of inferring 3D human shape, joint positions, and global translation. Moreover, we observe very solid 3D reconstruction performance for difficult human poses collected in the wild.
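The marker-based design can be illustrated with a toy numpy sketch. Everything below is hypothetical (names, dimensions, and the linear stand-in for GHUM are invented for illustration); the real system predicts markers with a transformer from image tokens and fits GHUM's full pose-and-shape model. The point shown is only the regularizing step: network-predicted markers are projected back onto a statistical body model, so the final surface always satisfies the model's anthropometric constraints.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (the real model uses GHUM's mesh
# resolution and learned surface marker placements).
NUM_MARKERS, NUM_VERTICES, NUM_BETAS = 8, 50, 4

# Stand-in "statistical body model": mean mesh plus linear shape
# blend shapes, vertices(beta) = template + blend @ beta.
template = rng.normal(size=(NUM_VERTICES, 3))
blend = rng.normal(size=(NUM_VERTICES, 3, NUM_BETAS)) * 0.1
# Markers are sparse points tied to fixed mesh vertices.
marker_idx = rng.choice(NUM_VERTICES, NUM_MARKERS, replace=False)

def vertices(beta):
    """Mesh vertices for shape coefficients beta."""
    return template + blend @ beta

def fit_model_to_markers(markers):
    """Recover shape coefficients whose marker locations best match
    the (network-predicted) markers, via linear least squares.
    This is the regularizing step: the output always lies on the
    statistical body model's shape manifold."""
    A = blend[marker_idx].reshape(-1, NUM_BETAS)       # (M*3, betas)
    b = (markers - template[marker_idx]).ravel()       # (M*3,)
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta

# Simulate the front end: a transformer would predict markers from
# image tokens; here we just take noisy ground-truth markers.
beta_true = rng.normal(size=NUM_BETAS)
markers_pred = vertices(beta_true)[marker_idx] \
    + rng.normal(scale=1e-4, size=(NUM_MARKERS, 3))

beta_hat = fit_model_to_markers(markers_pred)
```

Because the fit is over model coefficients rather than free vertices, noisy or implausible marker predictions cannot produce an implausible body shape.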

Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu • 2021

Related benchmarks

| Task | Dataset | Metric | Result (mm) | Rank |
| --- | --- | --- | --- | --- |
| 3D Human Pose Estimation | Human3.6M (test) | - | - | 547 |
| 3D Human Pose Estimation | Human3.6M (Protocol #1) | MPJPE (Avg.) | 55 | 440 |
| 3D Human Pose Estimation | Human3.6M (Protocol 2) | Average MPJPE | 48 | 315 |
| 3D Human Pose and Shape Estimation | 3DPW (test) | MPJPE-PA | 51.5 | 158 |
| Human Mesh Recovery | 3DPW | PA-MPJPE | 51.5 | 123 |
| Human Mesh Reconstruction | Human3.6M | PA-MPJPE | 39.8 | 50 |
| 3D Body Mesh Recovery | Human3.6M | PA-MPJPE | 34.9 | 46 |
| 3D Human Mesh Estimation | 3DPW | PA-MPJPE | 51.5 | 42 |
| 3D Human Pose Estimation | Human3.6M v1 (Protocol #2) | P-MPJPE (Avg.) | 34.9 | 33 |
| 3D Human Pose and Mesh Reconstruction | 3DPW (test) | PA-MPJPE | 51.5 | 20 |
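Several of the benchmark metrics above (MPJPE-PA, PA-MPJPE, P-MPJPE) are different leaderboard spellings of the same quantity: mean per-joint position error after Procrustes alignment, i.e. after removing the best-fitting similarity transform (rotation, uniform scale, translation) between prediction and ground truth. A minimal numpy sketch (the function name and array layout are my own, not from the paper):

```python
import numpy as np

def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE, in the units of the inputs.
    pred, gt: (J, 3) arrays of 3D joint positions.
    Rigidly aligns pred to gt with the optimal similarity
    transform, then returns the mean per-joint Euclidean error."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    P, G = pred - mu_p, gt - mu_g            # remove translation
    # Optimal rotation via SVD of the cross-covariance (Kabsch/Procrustes).
    U, S, Vt = np.linalg.svd(P.T @ G)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                 # fix a possible reflection
        Vt[-1] *= -1
        S[-1] *= -1
        R = Vt.T @ U.T
    scale = S.sum() / (P ** 2).sum()         # optimal uniform scale
    aligned = scale * P @ R.T + mu_g
    return np.linalg.norm(aligned - gt, axis=1).mean()
```

The unaligned MPJPE rows, by contrast, compare joints directly after matching root positions, so they additionally penalize global rotation and scale errors.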
