Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution
About
Estimating the pose and shape of hands and objects under interaction has numerous applications, including augmented and virtual reality. Existing approaches for hand and object reconstruction require explicitly defined physical constraints and known objects, which limits their application domains. Our algorithm is agnostic to object models, and it learns the physical rules governing hand-object interaction. This requires automatically inferring the shapes and physical interaction of hands and (potentially unknown) objects. We approach this challenging problem by proposing a collaborative learning strategy in which two branches of deep networks learn from each other. Specifically, we transfer hand mesh information to the object branch and object mesh information to the hand branch. The resulting optimisation (training) problem can be unstable, and we address this via two strategies: (i) attention-guided graph convolution, which helps identify and focus on mutual occlusion, and (ii) an unsupervised associative loss, which facilitates the transfer of information between the branches. Experiments on four widely used benchmarks show that our framework surpasses state-of-the-art accuracy in 3D pose estimation and also recovers dense 3D hand and object shapes. An ablation study shows that each of the technical components above contributes meaningfully.
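The abstract does not spell out the attention-guided graph convolution layer, so the following is only a minimal sketch of the general idea, using a generic GAT-style attention over mesh vertices rather than the authors' architecture. The function name `attention_graph_conv`, the parameters `w`, `a_src`, `a_dst`, and the use of a binary adjacency matrix with self-loops are all assumptions made for illustration.

```python
import numpy as np


def attention_graph_conv(x, adj, w, a_src, a_dst):
    """Generic attention-weighted graph convolution (illustrative only).

    x:     (N, F) per-vertex features, e.g. hand or object mesh vertices
    adj:   (N, N) binary adjacency, assumed to include self-loops
    w:     (F, F_out) hypothetical learnable projection
    a_src: (F_out,) hypothetical attention vector for the receiving vertex
    a_dst: (F_out,) hypothetical attention vector for the neighbour
    """
    h = x @ w  # project vertex features, shape (N, F_out)

    # Pairwise attention logits e_ij = LeakyReLU(a_src . h_i + a_dst . h_j)
    logits = (h @ a_src)[:, None] + (h @ a_dst)[None, :]
    logits = np.where(logits > 0, logits, 0.2 * logits)

    # Only neighbours may receive attention; others are masked out
    logits = np.where(adj > 0, logits, -np.inf)

    # Row-wise softmax (self-loops guarantee a finite maximum per row)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each vertex aggregates its neighbours, weighted by attention
    return weights @ h, weights
```

In a two-branch setting such as the one described, the returned attention weights are the part that could, in principle, learn to emphasise vertices affected by mutual occlusion; how the paper actually couples the hand and object branches is not specified here.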
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Hand-Object Interaction | HO3D v2 (test) | -- | -- | 20 |
| 3D Hand Pose Estimation | Dex-YCB (test) | PA-MPJPE (Scene 0) | 16.05 | 10 |
| Hand-Object Pose Estimation | DexYCB | MJE | 15.3 | 9 |
| Hand Pose Estimation | FPHAB (action split) | Hand Error (mm) | 9.8 | 6 |
| 3D Hand-Object Pose Estimation | DexYCB (test) | MJE (cm) | 1.53 | 4 |
| Hand Pose Estimation | FHB (subject) | Hand Error | 25.3 | 3 |
| Hand Pose Estimation | DexYCB 17 (S0) | Hand Error (mm) | 16.05 | 2 |
| Hand Pose Estimation | DexYCB 17 (S1) | Hand Error (mm) | 21.22 | 2 |
| Hand Pose Estimation | DexYCB 17 (S2) | Hand Error (mm) | 27.01 | 2 |
| Hand Pose Estimation | DexYCB 17 (S3) | Hand Error (mm) | 17.93 | 2 |
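
Several rows above report joint errors such as PA-MPJPE. As a rough illustration of what such a number measures, the sketch below computes the mean per-joint position error and its Procrustes-aligned variant under the commonly used definition (similarity alignment of the prediction to the ground truth before measuring error). The function names are hypothetical, and the exact evaluation protocol of each benchmark is not stated in this listing and may differ.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and true joints, both (J, 3)."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def pa_mpjpe(pred, gt):
    """MPJPE after aligning pred to gt with a similarity transform
    (rotation, uniform scale, translation) found by Procrustes analysis."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g  # centre both joint sets

    # SVD of the 3x3 cross-covariance gives the optimal rotation
    u, s, vt = np.linalg.svd(p.T @ g)

    # Flip the last axis if needed so the result is a rotation, not a reflection
    d = np.ones(3)
    if np.linalg.det(u @ vt) < 0:
        d[-1] = -1.0
    rot = u @ np.diag(d) @ vt

    # Least-squares optimal uniform scale
    scale = (s * d).sum() / (p ** 2).sum()

    aligned = scale * p @ rot + mu_g
    return mpjpe(aligned, gt)
```

The result is in whatever unit the joint coordinates use, so joints given in millimetres yield an error in millimetres. Because the alignment removes global rotation, scale, and translation, PA-MPJPE reflects only the error in articulated pose and is never larger than the unaligned MPJPE.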