GeoHand: Unlocking Prior Geometry Knowledge for Monocular 3D Hand Reconstruction

About

Monocular 3D hand reconstruction is intrinsically a geometric problem, yet RGB appearance features alone often struggle to resolve severe ambiguities caused by self-occlusions and hand-object interactions. While introducing depth can explicitly provide spatial cues, raw sensor-captured depth maps are noisy and incomplete, limiting their usefulness for fine-grained hand reconstruction. To bridge this gap, we propose GeoHand, a novel framework that unlocks high-quality geometric priors from a frozen foundational monocular geometry estimator (MoGe2). Recognizing that these priors are oriented toward general scenes, we introduce a map-level GeoAdapter to recalibrate the spatial features, specifically adapting them for detailed hand reconstruction. Furthermore, to integrate these adapted priors without overwhelming intrinsic RGB appearance cues, we employ a gated cross-modal token fusion strategy. Finally, to secure precise local articulation, we design a Keypoint-Queried Iterative Refiner (KQIR) that uses projected joint locations to query geometry-aware image features for spatial correction. By combining global geometric disambiguation with local refinement in a unified pipeline, GeoHand achieves state-of-the-art performance on FreiHAND, DexYCB, and HO3Dv3, especially under severe occlusions and hand-object interactions.
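
The abstract does not specify how the gated cross-modal token fusion is computed, so the following is only a minimal sketch of one common form of gating, not the authors' implementation. It assumes RGB and geometry tokens of equal dimension, a per-token scalar gate computed from their concatenation, and a residual update `fused = rgb + gate * geo`. The names `gated_fuse`, `gate_weights`, and `gate_bias` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(
    rgb_tokens: Sequence[Sequence[float]],
    geo_tokens: Sequence[Sequence[float]],
    gate_weights: Sequence[float],
    gate_bias: float,
) -> List[Vector]:
    """Fuse geometry tokens into RGB tokens through a learned scalar gate.

    Assumed form (hypothetical, not taken from the paper):
        gate  = sigmoid(w . [rgb; geo] + b)   one scalar per token
        fused = rgb + gate * geo              residual keeps RGB intact
    """
    if len(rgb_tokens) != len(geo_tokens):
        raise ValueError("RGB and geometry token counts must match")

    fused: List[Vector] = []
    for rgb, geo in zip(rgb_tokens, geo_tokens):
        if len(rgb) != len(geo):
            raise ValueError("RGB and geometry tokens must share a dimension")
        concat = list(rgb) + list(geo)
        if len(concat) != len(gate_weights):
            raise ValueError("gate_weights must have length 2 * token dimension")
        # Scalar gate in (0, 1) deciding how much geometry to admit.
        gate = sigmoid(sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias)
        # Residual update: when the gate is near 0 the RGB token passes through.
        fused.append([r + gate * g for r, g in zip(rgb, geo)])
    return fused


if __name__ == "__main__":
    rgb = [[1.0, 2.0]]
    geo = [[2.0, 4.0]]
    # Zero weights and zero bias give a gate of exactly 0.5.
    print(gated_fuse(rgb, geo, [0.0, 0.0, 0.0, 0.0], 0.0))
```

In this sketch, a strongly negative `gate_bias` keeps the gate nearly closed, so the fused tokens start out close to the RGB tokens. That is one plausible way to avoid overwhelming appearance cues, though the paper may use a different mechanism.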

Weiquan Lin, Yaoqing Hu, Liangchen Dai, Xu Tang, Xingyu Chen • 2026

Related benchmarks

Task                    Dataset   Metric    Result  Rank
3D Hand Reconstruction  FreiHAND  PA-MPVPE  5.4     33
3D Hand Reconstruction  HO3D v3   PA-MPJPE  6.7     25
3D Hand Reconstruction  DexYCB    PA-MPJPE  5       6
