
AHAP: Reconstructing Arbitrary Humans from Arbitrary Perspectives with Geometric Priors

About

Reconstructing 3D humans from images captured from multiple perspectives typically requires camera pre-calibration, for example with checkerboards or multi-view stereo (MVS) algorithms, which limits scalability and applicability in diverse real-world scenarios. In this work, we present AHAP (Reconstructing Arbitrary Humans from Arbitrary Perspectives), a feed-forward framework for reconstructing arbitrary humans from arbitrary camera perspectives without requiring camera calibration. The core of our approach is the effective fusion of multi-view geometry to assist human association, reconstruction, and localization. Specifically, a Cross-View Identity Association module uses learnable person queries and soft assignment, supervised by contrastive learning, to resolve cross-view human identity association. A Human Head fuses cross-view features and scene context for SMPL prediction, guided by cross-view reprojection losses that enforce body pose consistency. Additionally, multi-view geometry eliminates the depth ambiguity inherent in monocular methods, providing more precise 3D human localization through multi-view triangulation. Experiments on EgoHumans and EgoExo4D demonstrate that AHAP achieves competitive performance on both world-space human reconstruction and camera pose estimation, while being 180× faster than optimization-based approaches.
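To illustrate why a second view removes the depth ambiguity mentioned in the abstract, here is a minimal sketch of two-ray midpoint triangulation. It is not AHAP's implementation: it assumes the camera centers and viewing-ray directions are already known in a shared world frame, whereas AHAP estimates camera poses itself. The function name `triangulate_midpoint` and its parameters are hypothetical, chosen for this example.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _point_on_ray(origin: Vec3, direction: Vec3, t: float) -> Vec3:
    return (origin[0] + t * direction[0],
            origin[1] + t * direction[1],
            origin[2] + t * direction[2])


def triangulate_midpoint(o1: Vec3, d1: Vec3, o2: Vec3, d2: Vec3,
                         eps: float = 1e-12) -> Vec3:
    """Midpoint of the shortest segment between rays o1 + t1*d1 and o2 + t2*d2.

    Assumption (not from the paper): o1, o2 are camera centers and d1, d2 are
    viewing-ray directions toward the same body joint, all in one world frame.
    """
    w = (o1[0] - o2[0], o1[1] - o2[1], o1[2] - o2[2])
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b  # zero when the rays are parallel
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; depth is not recoverable")
    # Closed-form least-squares solution for the two ray parameters.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = _point_on_ray(o1, d1, t1)
    p2 = _point_on_ray(o2, d2, t2)
    return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2, (p1[2] + p2[2]) / 2)


# Two cameras one unit apart, both observing the same point.
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.2, 3.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.2, 3.0))
```

A single ray leaves the point free to slide along its direction; the second ray pins it down. When the two rays are parallel the system is degenerate, which is why the sketch raises an error in that case.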

Xiaozhen Qiao, Wenjia Wang, Zhiyuan Zhao, Jiacheng Sun, Ping Luo, Hongyuan Zhang, Xuelong Li • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Efficiency Evaluation | EgoHumans (test) | Inference Time (s) | 0.34 | 7 |
| Human Mesh Recovery | EgoHumans (test) | W-MPJPE | 0.88 | 6 |
| Human Mesh Recovery | EgoExo4D (test) | W-MPJPE | 0.6 | 6 |
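
The page does not define W-MPJPE or give its units. A common reading is the mean per-joint position error measured in world coordinates; the sketch below assumes that reading and applies no alignment step, so it may differ from the exact protocol behind the numbers above. The function name `mpjpe` is hypothetical.

```python
import math
from typing import Sequence


def mpjpe(pred: Sequence[Sequence[float]], gt: Sequence[Sequence[float]]) -> float:
    """Mean Euclidean distance between corresponding predicted and ground-truth joints.

    Assumption: both joint sets are already expressed in the same world frame
    and in the same units.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Two joints: the first is off by one unit along z, the second is exact.
error = mpjpe([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
              [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)])
```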