
Look Ma, no markers: holistic performance capture without the hassle

About

We tackle the problem of highly accurate, holistic performance capture for the face, body and hands simultaneously. Motion-capture technologies used in film and game production typically focus only on face, body or hand capture independently, and involve complex, expensive hardware and a high degree of manual intervention from skilled operators. While machine-learning-based approaches exist to overcome these problems, they usually only support a single camera, often operate on a single part of the body, do not produce precise world-space results, and rarely generalize outside specific contexts. In this work, we introduce the first technique for marker-free, high-quality reconstruction of the complete human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Our approach produces stable world-space results from arbitrary camera rigs and supports varied capture environments and clothing. We achieve this through a hybrid approach that combines machine learning models trained exclusively on synthetic data with powerful parametric models of human shape and motion. We evaluate our method on a number of body, face and hand reconstruction benchmarks and demonstrate state-of-the-art results that generalize across diverse datasets.

Charlie Hewitt, Fatemeh Saleh, Sadegh Aliakbarian, Lohit Petikam, Shideh Rezaeifar, Louis Florentin, Zafiirah Hosenie, Thomas J Cashman, Julien Valentin, Darren Cosker, Tadas Baltrusaitis • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Human Mesh Recovery | MoYo | MPJPE | 60.15 | 16 |
| 3D Human Pose Estimation | Chi3D | MPJPE | 46.47 | 15 |
| Human Mesh Recovery | RICH | -- | -- | 13 |
| Human Pose Estimation | Harmony4D | PVE | 45.6 | 9 |
| Hand Pose Estimation | FreiHAND (test) | PA-MPVPE | 8.1 | 7 |
| 3D human mesh fitting | MammaEval-S | MPJPE | 25.97 | 5 |
| 3D human mesh fitting | MammaEval-D | MPJPE | 27.98 | 5 |
| 3D human reconstruction | Harmony4D + CHI3D + MammaEval-D (test) | Mean Perceptual Depth (mm) | 13.73 | 5 |
| 2D Landmark Prediction | Harmony4D IoU > 0.5 | Mean 2D Euclidean Distance Error (pixels) | 31.45 | 4 |
| 2D Landmark Prediction | RICH | Mean 2D Euclidean Distance Error (pixels) | 13.26 | 4 |

Showing 10 of 14 rows
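
Several of the results above are reported as MPJPE (mean per-joint position error), which is generally defined as the average Euclidean distance between predicted and ground-truth 3D joint positions. The sketch below illustrates that generic definition only. It is not the evaluation code behind these numbers, and each benchmark may apply its own protocol (root alignment, joint subset, units) that the listing does not state. The function name `mpjpe` and the sample joints are hypothetical.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: equal-length sequences of (x, y, z) joint positions
    expressed in the same units and coordinate frame.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    # Average the per-joint Euclidean distances.
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Hypothetical three-joint example (values chosen for illustration only).
pred = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
gt = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 12.0)]
error = mpjpe(pred, gt)  # per-joint distances are 5, 0 and 12
```

Variants such as PA-MPVPE in the table apply the same averaging idea to mesh vertices after a Procrustes alignment step, which this sketch does not implement.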
