
HARP: Personalized Hand Reconstruction from a Monocular RGB Video

About

We present HARP (HAnd Reconstruction and Personalization), a personalized hand avatar creation approach that takes a short monocular RGB video of a human hand as input and reconstructs a faithful hand avatar exhibiting high-fidelity appearance and geometry. In contrast to the major trend of neural implicit representations, HARP models a hand with a mesh-based parametric hand model, a vertex displacement map, a normal map, and an albedo without any neural components. As validated by our experiments, the explicit nature of our representation enables a truly scalable, robust, and efficient approach to hand avatar creation. HARP is optimized via gradient descent from a short sequence captured by a hand-held mobile phone and can be directly used in AR/VR applications with real-time rendering capability. To enable this, we carefully design and implement a shadow-aware differentiable rendering scheme that is robust to the high-degree articulations and self-shadowing regularly present in hand motion sequences, as well as to challenging lighting conditions. It also generalizes to unseen poses and novel viewpoints, producing photo-realistic renderings of hand animations performing highly articulated motions. Furthermore, the learned HARP representation can be used to improve 3D hand pose estimation quality in challenging viewpoints. The key advantages of HARP are validated by in-depth analyses of appearance reconstruction, novel-view and novel-pose synthesis, and 3D hand pose refinement. It is an AR/VR-ready personalized hand representation that shows superior fidelity and scalability.
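To make the idea of an explicit, gradient-optimized representation concrete, here is a minimal toy sketch in Python. It is not the HARP implementation: it assumes that each vertex is offset along its unit normal by one scalar displacement, and it replaces the shadow-aware differentiable renderer and image-based losses with a direct squared-distance loss to known target vertices. The function names `displace` and `fit_displacements` are hypothetical.

```python
def displace(vertices, normals, d):
    """Offset each vertex along its unit normal by a scalar displacement.

    Assumption for illustration: one scalar per vertex, applied along the normal.
    """
    return [
        tuple(v[k] + di * n[k] for k in range(3))
        for v, n, di in zip(vertices, normals, d)
    ]


def fit_displacements(vertices, normals, target, steps=200, lr=0.1):
    """Plain gradient descent on a summed squared vertex error.

    The squared-distance loss is a toy stand-in for a rendering-based loss.
    """
    d = [0.0] * len(vertices)
    for _ in range(steps):
        moved = displace(vertices, normals, d)
        for i, (m, t, n) in enumerate(zip(moved, target, normals)):
            # d/dd_i of ||v_i + d_i * n_i - t_i||^2 = 2 * (residual . n_i)
            grad = 2.0 * sum((m[k] - t[k]) * n[k] for k in range(3))
            d[i] -= lr * grad
    return d


# Usage: recover known displacements on a two-vertex toy "mesh".
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
d_true = [0.02, -0.01]
target = displace(vertices, normals, d_true)
d_fit = fit_displacements(vertices, normals, target)
```

Because every parameter here is an explicit per-vertex quantity, the fitted values can be inspected or edited directly, which is the property the abstract contrasts with neural implicit representations.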

Korrawe Karunratanakul, Sergey Prokudin, Otmar Hilliges, Siyu Tang • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Novel View Synthesis | InterHand2.6M (test) | LPIPS | 0.1367 | 12 |
| Appearance reconstruction | InterHand2.6M (test) | L1 Loss | 0.0157 | 8 |
| Appearance reconstruction | RGB2Hands | L1 Loss | 0.0155 | 4 |
| Novel Pose Reconstruction | InterHand2.6M (test) | L1 Error | 0.0256 | 4 |
| Novel Poses | RGB2Hands | L1 Loss | 0.0208 | 4 |
| 3D Hand Avatar Reconstruction | HARP subject_1 (sequences 6-9) (test) | PSNR | 27.5 | 3 |
| 3D Hand Avatar Reconstruction | Phone scan dataset (test) | PSNR | 29.89 | 3 |
| Contact estimation | MANUS-Grasps (Subject1) | mIoU | 0.173 | 3 |
| Contact estimation | MANUS-Grasps (Subject2) | mIoU | 14.8 | 3 |
| Contact estimation | MANUS-Grasps (Subject3) | mIoU | 0.224 | 3 |
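For readers unfamiliar with the metrics listed in the benchmark rows, the following Python sketch shows the textbook definitions of L1 error, PSNR, and IoU on flat sequences of values. The exact evaluation protocols behind the numbers (masking, image value range, averaging order) are not stated on this page, so this is an illustration under assumed conventions rather than the benchmark code. LPIPS is omitted because it requires a pretrained network. IoU lies in [0, 1] and is sometimes reported as a percentage; mIoU is the mean of IoU over samples or classes.

```python
import math


def l1_error(pred, target):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; max_value=1.0 assumes values in [0, 1]."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


def iou(pred_mask, target_mask):
    """Intersection over union of two binary masks given as flat sequences."""
    inter = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, target_mask) if p or t)
    return inter / union if union else 1.0  # two empty masks count as a match


def mean_iou(pairs):
    """Average IoU over (pred_mask, target_mask) pairs."""
    return sum(iou(p, t) for p, t in pairs) / len(pairs)
```

Lower is better for L1 and LPIPS, while higher is better for PSNR and mIoU.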
