Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation

About

In this paper we present a novel approach for bottom-up multi-person 3D human pose estimation from monocular RGB images. We propose to use high-resolution volumetric heatmaps to model joint locations, devising a simple and effective compression method to drastically reduce the size of this representation. At the core of the proposed method lies our Volumetric Heatmap Autoencoder, a fully-convolutional network tasked with the compression of ground-truth heatmaps into a dense intermediate representation. A second model, the Code Predictor, is then trained to predict these codes, which can be decompressed at test time to re-obtain the original representation. Our experimental evaluation shows that our method performs favorably when compared to the state of the art on both multi-person and single-person 3D human pose estimation datasets and, thanks to our novel compression strategy, can process full-HD images at a constant runtime of 8 fps regardless of the number of subjects in the scene. Code and models are available at https://github.com/fabbrimatteo/LoCO.
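To make the idea of a volumetric heatmap concrete, here is a minimal sketch that places one joint in a small 3D grid and recovers its location from the peak. The Gaussian shape, the 8x8x8 grid, the `sigma` value, and the function names `make_volumetric_heatmap` and `argmax_3d` are all assumptions made for illustration; the abstract does not state the resolution or encoding actually used, and this sketch does not include the autoencoder or the Code Predictor.

```python
import math

def make_volumetric_heatmap(joint, shape=(8, 8, 8), sigma=1.0):
    """Build a depth x height x width nested list with a Gaussian peak at `joint`.

    Assumption: a Gaussian blob is used purely for illustration.
    """
    d0, h0, w0 = joint
    depth, height, width = shape
    return [
        [
            [
                math.exp(-((d - d0) ** 2 + (h - h0) ** 2 + (w - w0) ** 2)
                         / (2 * sigma ** 2))
                for w in range(width)
            ]
            for h in range(height)
        ]
        for d in range(depth)
    ]

def argmax_3d(volume):
    """Return the (depth, height, width) index of the largest value."""
    best_index, best_value = (0, 0, 0), float("-inf")
    for d, plane in enumerate(volume):
        for h, row in enumerate(plane):
            for w, value in enumerate(row):
                if value > best_value:
                    best_index, best_value = (d, h, w), value
    return best_index

# Hypothetical joint at depth bin 2, row 5, column 3.
heatmap = make_volumetric_heatmap((2, 5, 3))
peak = argmax_3d(heatmap)
print(peak)
```

Even this toy grid stores 512 values for a single joint, which hints at why a high-resolution volumetric representation benefits from compression.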

Matteo Fabbri, Fabio Lanzi, Simone Calderara, Stefano Alletto, Rita Cucchiara • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Human Pose Estimation | Human3.6M (Protocol #1) | MPJPE (Avg.) | 43.4 | 440 |
| 3D Human Pose Estimation | Human3.6M (Protocol 2) | Average MPJPE | 49.1 | 315 |
| Multi-person 3D Pose Estimation | MuPoTS-3D (test) | -- | -- | 41 |
| Multi-person 3D Human Pose Estimation | CMU Panoptic | MPJPE (Mean) [mm] | 69 | 37 |
| 3D Human Pose Estimation | JTA (test) | F1 Score @ 0.4m | 50.82 | 15 |
| 3D Human Pose Estimation | CMU Panoptic (test) | MPJPE | 69 | 15 |
| 3D Multi-person Pose Estimation | JTA synthetic (test) | F1 (t=0.4m) | 50.82 | 3 |
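Several of the benchmark rows above report MPJPE (mean per-joint position error), the average Euclidean distance between predicted and ground-truth 3D joints. The sketch below computes it for a single pose. The function name `mpjpe` and the sample coordinates are hypothetical, and protocol-specific steps such as root alignment or rigid alignment, which differ between benchmarks, are deliberately left out.

```python
import math

def mpjpe(predicted, ground_truth):
    """Mean Euclidean distance between corresponding 3D joints.

    Both arguments are sequences of (x, y, z) tuples in the same unit.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("poses must be non-empty and have the same joint count")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)

# Hypothetical two-joint pose, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0)]
error = mpjpe(pred, gt)  # per-joint errors are 5 and 12, so the mean is 8.5
print(error)
```

`math.dist` requires Python 3.8 or later.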
