
HandFoldingNet: A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton

About

With the growing use of 3D hand pose estimation in human-computer interaction, convolutional neural network (CNN)-based estimation models have been actively explored. However, existing models require complex architectures or redundant computation to achieve acceptable accuracy. To tackle this limitation, this paper proposes HandFoldingNet, an accurate and efficient hand pose estimator that regresses hand joint locations from a normalized 3D hand point cloud input. The proposed model uses a folding-based decoder that folds a given 2D hand skeleton into the corresponding 3D joint coordinates. For higher estimation accuracy, folding is guided by multi-scale features, which include both global and joint-wise local features. Experimental results show that the proposed model outperforms existing methods on three hand pose benchmark datasets while requiring the fewest model parameters. Code is available at https://github.com/cwc1260/HandFold.
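To make the folding idea concrete, here is a minimal sketch of a folding-based decoder, assuming a simple MLP and a fixed canonical 2D skeleton. All names, dimensions, and weights are illustrative, not the authors' implementation: each 2D joint position is concatenated with a shared global feature (e.g. from a point-cloud encoder) and mapped, or "folded", into a 3D joint coordinate.

```python
import numpy as np

rng = np.random.default_rng(0)

J = 21      # number of hand joints (common hand-skeleton convention)
FEAT = 128  # global feature dimension (assumed for illustration)

# Hypothetical MLP weights for the folding decoder
w1 = rng.standard_normal((2 + FEAT, 64)) * 0.1
b1 = np.zeros(64)
w2 = rng.standard_normal((64, 3)) * 0.1
b2 = np.zeros(3)

def fold(skeleton_2d, global_feat):
    """Fold a (J, 2) skeleton into (J, 3) joint coordinates."""
    # Tile the global feature so every joint sees the same context
    g = np.tile(global_feat, (J, 1))              # (J, FEAT)
    x = np.concatenate([skeleton_2d, g], axis=1)  # (J, 2 + FEAT)
    h = np.maximum(x @ w1 + b1, 0.0)              # linear + ReLU, (J, 64)
    return h @ w2 + b2                            # (J, 3)

skeleton_2d = rng.uniform(-1, 1, size=(J, 2))  # canonical 2D hand skeleton
global_feat = rng.standard_normal(FEAT)        # stand-in encoder output

joints_3d = fold(skeleton_2d, global_feat)
print(joints_3d.shape)  # (21, 3)
```

In the paper this folding is further guided by joint-wise local features at multiple scales; the sketch above shows only the global-feature case to convey the core mechanism.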

Wencan Cheng, Jae Hyun Park, Jong Hwan Ko • 2021

Related benchmarks

Task | Dataset | Metric | Result | Rank
3D Hand Pose Estimation | NYU (test) | Mean Error (mm) | 8.42 | 100
3D Hand Pose Estimation | ICVL (test) | Mean Error (mm) | 5.95 | 91
3D Hand Pose Estimation | MSRA | Mean Error (mm) | 7.34 | 32
Hand Pose Estimation | NYU (test) | 3D Error (mm) | 8.58 | 25
3D Hand Pose Estimation | MSRA (test) | 3D Error (mm) | 7.34 | 23
3D Hand Pose Estimation | NYU | Mean Distance Error (mm) | 8.58 | 19
3D Hand Pose Estimation | ICVL | Mean Distance Error (mm) | 5.95 | 17
Hand Pose Estimation | MSRA (leave-one-subject-out) | Mean Error (mm) | 7.34 | 12
3D Hand Pose Estimation | 3D Hand Pose Estimation Benchmarks (test) | FPS | 84 | 4
