
MEGA: Masked Generative Autoencoder for Human Mesh Recovery

About

Human Mesh Recovery (HMR) from a single RGB image is a highly ambiguous problem, as an infinite set of 3D interpretations can explain the 2D observation equally well. Nevertheless, most HMR methods overlook this issue and make a single prediction without accounting for this ambiguity. A few approaches generate a distribution of human meshes, enabling the sampling of multiple predictions; however, none of them is competitive with the latest single-output model when making a single prediction. This work proposes a new approach based on masked generative modeling. By tokenizing the human pose and shape, we formulate the HMR task as generating a sequence of discrete tokens conditioned on an input image. We introduce MEGA, a MaskEd Generative Autoencoder trained to recover human meshes from images and partial human mesh token sequences. Given an image, our flexible generation scheme allows us to predict a single human mesh in deterministic mode or to generate multiple human meshes in stochastic mode. Experiments on in-the-wild benchmarks show that MEGA achieves state-of-the-art performance in deterministic and stochastic modes, outperforming single-output and multi-output approaches.
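The two generation modes described above can be illustrated with a toy masked-token decoder. This is only a sketch of confidence-based iterative unmasking as commonly used in masked generative models, not the paper's actual procedure: the network is replaced by a hypothetical stand-in (`toy_logits`), and the codebook size, sequence length, step count, and unmasking schedule are all assumptions made for illustration.

```python
import math
import random

MASK = -1        # sentinel for a token that has not been generated yet
VOCAB_SIZE = 8   # toy codebook size (hypothetical)
SEQ_LEN = 6      # toy number of mesh tokens (hypothetical)


def toy_logits(tokens, image_feature):
    """Hypothetical stand-in for an image-conditioned network.

    Returns one row of logits per position; a real model would be a
    trained network that sees the image and the partially masked sequence.
    """
    known = sum(t for t in tokens if t != MASK)
    return [
        [3.0 * math.cos(0.7 * v + 1.3 * pos + image_feature + 0.1 * known)
         for v in range(VOCAB_SIZE)]
        for pos in range(len(tokens))
    ]


def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def generate(image_feature, steps=3, stochastic=False, rng=None):
    """Fill a fully masked token sequence over at most `steps` rounds."""
    rng = rng or random.Random()
    tokens = [MASK] * SEQ_LEN
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        probs = [softmax(r) for r in toy_logits(tokens, image_feature)]
        proposals = {}
        for i in masked:
            if stochastic:
                # Stochastic mode: sample a token from the predicted distribution.
                tok = rng.choices(range(VOCAB_SIZE), weights=probs[i])[0]
            else:
                # Deterministic mode: take the most likely token.
                tok = max(range(VOCAB_SIZE), key=lambda v: probs[i][v])
            proposals[i] = (tok, probs[i][tok])
        # Commit the most confident proposals; the last round commits all.
        n_commit = math.ceil(len(masked) / (steps - step))
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)
        for i in best[:n_commit]:
            tokens[i] = proposals[i][0]
    return tokens


single = generate(image_feature=0.5)  # one prediction
samples = [generate(0.5, stochastic=True, rng=random.Random(s)) for s in range(5)]
```

In this sketch, deterministic mode always returns the same token sequence for a given input, while stochastic mode yields different sequences for different random seeds. In an actual system each sequence would then be decoded back into a human mesh by the tokenizer's decoder.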

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Francesc Moreno-Noguer • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
3D Human Pose Estimation | 3DPW (test) | PA-MPJPE | 41 | 505
3D Human Mesh Recovery | 3DPW (test) | PA-MPJPE | 40.4 | 264
Human Mesh Recovery | EMDB (test) | PA-MPJPE | 52.5 | 19
Human Mesh Recovery | 3DPW in-the-wild (test) | PVE | 80 | 13
Human Mesh Recovery | 3DPW-OC | PA-MPJPE | 43.7 | 12
Human Motion Reconstruction | RICH (test) | PA-MPJPE | 50.53 | 12
Human Motion Reconstruction | EgoBody occ 1.0 (test) | PA-MPJPE | 37.8 | 9
Human Motion Reconstruction | RICH 1.0 (test) | PA-MPJPE | 50.53 | 8
