MetricHMSR: Metric Human Mesh and Scene Recovery from Monocular Images

About

We introduce MetricHMSR, a novel framework for recovering metric human meshes and 3D scenes from a single monocular image. Existing methods struggle to recover metric scale due to monocular scale ambiguity and weak-perspective camera assumptions. Moreover, their fully coupled feature representations make it difficult to disentangle local pose from global translation, often requiring multi-stage pipelines that introduce accumulated errors. To address these challenges, we propose MetricHMR (Metric Human Mesh Recovery), which incorporates a bounding camera ray map representation to provide explicit metric cues for human reconstruction, together with a Human Mixture-of-Experts (HumanMoE) that dynamically routes image features to specialized experts, enabling the disentangled perception of local human pose and global metric position. Leveraging the recovered metric human as a geometric anchor, we further refine monocular metric depth estimation to achieve more accurate 3D alignment between humans and scenes. Comprehensive experiments demonstrate that our method achieves state-of-the-art performance on both human mesh recovery and metric human-scene reconstruction. Project Page: https://Metaverse-AI-Lab-THU.github.io/MetricHMSR.
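The abstract does not spell out how the bounding camera ray map is constructed. As a rough illustration of the general idea only, the sketch below samples a small grid of pixels inside a person bounding box and converts each to a unit ray direction under a simple pinhole camera. The function name `bbox_ray_map`, the grid size, and the single-focal-length pinhole model are all assumptions made for this example, not details taken from the paper.

```python
import math


def bbox_ray_map(bbox, focal, center, grid=4):
    """Hypothetical helper: unit camera rays sampled on a grid inside a bbox.

    bbox   -- (x0, y0, x1, y1) in pixels
    focal  -- focal length in pixels (assumed equal for x and y)
    center -- (cx, cy) principal point in pixels
    grid   -- number of samples per side
    """
    x0, y0, x1, y1 = bbox
    cx, cy = center
    rays = []
    for j in range(grid):
        # Sample at cell centers so the grid covers the box evenly.
        v = y0 + (j + 0.5) * (y1 - y0) / grid
        row = []
        for i in range(grid):
            u = x0 + (i + 0.5) * (x1 - x0) / grid
            # Back-project the pixel to a direction at depth 1 (pinhole model).
            d = ((u - cx) / focal, (v - cy) / focal, 1.0)
            norm = math.sqrt(sum(c * c for c in d))
            row.append(tuple(c / norm for c in d))
        rays.append(row)
    return rays


# A box centered on the principal point: rays fan out symmetrically.
rays = bbox_ray_map((0, 0, 200, 200), focal=100.0, center=(100, 100), grid=2)
```

Because the rays depend on both the box position and the focal length, a map like this carries information that a weak-perspective crop discards, which is the kind of explicit metric cue the abstract refers to.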

Chentao Song, He Zhang, Haolei Yuan, Haozhe Lin, Jianhua Tao, Hongwen Zhang, Tao Yu • 2025

Related benchmarks

Task | Dataset | Result | Rank
3D Human Pose Estimation | 3DPW | PA-MPJPE 33.6 | 127
Global human motion estimation | RICH | WA-MPJPE 109.6 | 21
Global motion and trajectory estimation | EMDB 2 | WA-MPJPE 55.6 | 15
Human local body pose estimation | EMDB 1 | PA-MPJPE 43.2 | 7
Depth Estimation | PROX | AbsRel 13 | 4
3D Position Estimation | SynFocal | RDE 0.1 | 2
Body Shape and Height Estimation | 3DPW | H-MAE 70.1 | 2
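For readers unfamiliar with the depth metric listed for PROX, AbsRel is conventionally the mean of |predicted - ground truth| / ground truth over valid pixels. The sketch below shows that standard definition on flat lists of depths; the function name `abs_rel` and the rule of skipping non-positive ground-truth values are assumptions for this example, and the listing does not state the units or scaling of the reported value.

```python
def abs_rel(pred, gt):
    """Mean absolute relative depth error over pixels with valid ground truth."""
    # Assumption: ground-truth depths <= 0 mark invalid pixels and are skipped.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Two pixels: one off by 10 percent, one exact.
error = abs_rel([1.1, 2.0], [1.0, 2.0])
```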