BodyNet: Volumetric Inference of 3D Human Body Shapes

About

Human shape estimation is an important task for video editing, animation, and the fashion industry. Predicting 3D human body shape from natural images, however, is highly challenging due to factors such as variation in human bodies, clothing and viewpoint. Prior methods addressing this problem typically attempt to fit parametric body models with certain priors on pose and shape. In this work we argue for an alternative representation and propose BodyNet, a neural network for direct inference of volumetric body shape from a single image. BodyNet is an end-to-end trainable network that benefits from (i) a volumetric 3D loss, (ii) a multi-view re-projection loss, and (iii) intermediate supervision of 2D pose, 2D body part segmentation, and 3D pose. Each of these components improves performance, as demonstrated by our experiments. To evaluate the method, we fit the SMPL model to our network output and show state-of-the-art results on the SURREAL and Unite the People datasets, outperforming recent approaches. Besides achieving state-of-the-art performance, our method also enables volumetric body-part segmentation.
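
The first two loss terms named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes an occupancy grid with axis order (x, y, z) and values in [0, 1], binary cross-entropy as the per-voxel loss, and orthographic max-projection along z (front view) and x (side view) to obtain silhouettes. The function names and the weights `w_vox` and `w_proj` are hypothetical.

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two arrays of the same shape."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))

def volumetric_loss(pred_vox, gt_vox):
    """Per-voxel loss over the full 3D occupancy grid."""
    return binary_cross_entropy(pred_vox, gt_vox)

def reprojection_loss(pred_vox, gt_front, gt_side):
    """Compare projected silhouettes from two views (assumed: front and side)."""
    front = pred_vox.max(axis=2)  # collapse depth z, leaving an (x, y) image
    side = pred_vox.max(axis=0)   # collapse x, leaving a (y, z) image
    return (binary_cross_entropy(front, gt_front)
            + binary_cross_entropy(side, gt_side))

def combined_loss(pred_vox, gt_vox, w_vox=1.0, w_proj=0.1):
    """Weighted sum of both terms; the weights are illustrative only."""
    gt_front = gt_vox.max(axis=2)
    gt_side = gt_vox.max(axis=0)
    return (w_vox * volumetric_loss(pred_vox, gt_vox)
            + w_proj * reprojection_loss(pred_vox, gt_front, gt_side))

if __name__ == "__main__":
    gt = np.zeros((8, 8, 8))
    gt[2:6, 2:6, 2:6] = 1.0                # a toy cube standing in for a body
    uncertain = np.full_like(gt, 0.5)      # a prediction with no information
    print(combined_loss(gt, gt), combined_loss(uncertain, gt))
```

In this sketch the ground-truth silhouettes are derived from the ground-truth voxels. A real training setup could instead take them from 2D segmentation masks, which is a design choice the abstract does not specify.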

Gül Varol, Duygu Ceylan, Bryan Russell, Jimei Yang, Ersin Yumer, Ivan Laptev, Cordelia Schmid • 2018

Related benchmarks

Task	Dataset	Result
3D Human Pose Estimation	Human3.6M (Protocol #1)	MPJPE (Avg.): 49	457
Foreground-Background Segmentation	LSP (test)	Accuracy: 92.75	36
3D human body pose and mesh estimation	Surreal (test)	MPJPE: 40.8	30
3D human reconstruction	BUFF (test)	P2S Distance: 4.94	23
3D Human Pose Estimation	H36M	MPJPE: 51.6	19
3D human reconstruction	RenderPeople (test)	Normal Error: 0.26	16
3D human reconstruction	RenderPeople	Normal Error: 0.262	12
3D human reconstruction	BUFF	P2S Distance: 4.94	11
Volumetric prediction	SURREAL (full)	Avg SMPL Surface Error: 65.8	10
3D Human Body Shape Estimation	UP dataset	Accuracy: 92.97	10

Showing 10 of 16 rows
