
SDPose: Exploiting Diffusion Priors for Out-of-Domain and Robust Pose Estimation

About

Pre-trained diffusion models provide rich latent features across U-Net levels and are emerging as powerful vision backbones. While prior works such as Marigold and Lotus repurpose diffusion priors for dense geometric perception tasks such as depth and surface normal estimation, their potential for cross-domain human pose estimation remains largely unexplored. Through a systematic analysis of latent features from different upsampling levels of the Stable Diffusion U-Net, we identify the levels that deliver the strongest robustness and cross-domain generalization for pose estimation. Building on these findings, we propose SDPose, which (i) extracts U-Net features from the selected upsampling blocks, (ii) fuses them with a lightweight feature aggregation module to form a robust representation, and (iii) jointly optimizes keypoint heatmap supervision with an auxiliary latent reconstruction loss to regularize training and preserve the pre-trained generative prior. To evaluate cross-domain generalization and robustness, we construct COCO-OOD, a COCO-based benchmark with four subsets: three style-transferred splits to assess domain shift, and one corruption split (noise, weather, digital artifacts, and blur) to test robustness. With a shorter fine-tuning schedule, SDPose achieves performance comparable to Sapiens on COCO, surpasses Sapiens-1B on COCO-WholeBody, and establishes new state-of-the-art results on HumanArt and COCO-OOD.
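
Step (iii) of the abstract can be illustrated with a minimal sketch of a joint objective: a keypoint heatmap term plus a weighted auxiliary latent reconstruction term. The abstract does not state the form of either loss or how they are weighted, so the use of mean squared error, the `recon_weight` parameter, its default value, and the function names below are all assumptions made for illustration, not the authors' implementation.

```python
def mse(pred, target):
    """Mean squared error between two equally sized flat sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def joint_loss(pred_heatmap, gt_heatmap, pred_latent, input_latent, recon_weight=0.1):
    """Heatmap supervision plus a weighted latent reconstruction regularizer.

    Assumption: both terms are MSE and are combined by a scalar weight;
    the abstract specifies neither choice.
    """
    heatmap_term = mse(pred_heatmap, gt_heatmap)    # keypoint supervision
    recon_term = mse(pred_latent, input_latent)     # auxiliary regularizer
    return heatmap_term + recon_weight * recon_term


# Toy flattened tensors: heatmap error 0.5, reconstruction error 2.0.
loss = joint_loss([0.0, 1.0], [0.0, 0.0], [1.0, 1.0], [1.0, 3.0], recon_weight=0.5)
print(loss)
```

The reconstruction term is what ties the fine-tuned network back to its generative pre-training: with `recon_weight` set to zero, the sketch reduces to plain heatmap regression.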

Shuang Liang, Jing He, Chuanmeizhi Wang, Lejun Liao, Guo Zhang, Yingcong Chen, Yuan Yuan • 2025

Related benchmarks

Task | Dataset | Result | Rank
Human Pose Estimation | COCO (val) | AP 81.2 | 57
Whole-body Pose Estimation | COCO-WholeBody (val) | Whole AP 72.8 | 25
Human Keypoint Estimation | Human-Art (val) | AP 71.8 | 19
Body Pose Estimation | HumanART | AP 71.8 | 4
Whole-body Pose Estimation | COCO-OOD-Monet Wholebody (val) | Body AP 61.3 | 4
Wholebody Pose Estimation | COCO-OOD-Monet (val) | Left Hand AP 48.8 | 4
Body Pose Estimation | COCO-OOD Monet (Body) | AP 64 | 3
Body Pose Estimation | COCO-OOD Ukiyo-e (Body) | AP 66.1 | 3
Wholebody Pose Estimation | COCO-OOD Ukiyo-e Wholebody | AP 50 | 3
Wholebody Pose Estimation | COCO-OOD Corruption Wholebody | AP 54.3 | 3
Showing 10 of 14 rows
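
For readers unfamiliar with the AP metric in this table: COCO-style keypoint benchmarks score a prediction by Object Keypoint Similarity (OKS) and average precision over OKS thresholds. The page itself does not describe the evaluation protocol, so treating these results as OKS-based is an assumption drawn from the dataset names. The sketch below computes OKS for a single person; the function name, argument layout, and toy values are illustrative only.

```python
import math


def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity for one person instance.

    pred, gt : lists of (x, y) keypoint coordinates
    visible  : per-keypoint visibility flags (> 0 means labeled)
    area     : ground-truth object area in pixels
    sigmas   : per-keypoint falloff constants
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2 * sigma
        total += math.exp(-d2 / (2 * area * k ** 2))
        count += 1
    return total / count if count else 0.0


# Toy example: one exact keypoint, one unlabeled keypoint that is ignored.
print(oks([(5.0, 5.0), (9.0, 9.0)], [(5.0, 5.0), (0.0, 0.0)], [2, 0], 100.0, [0.05, 0.05]))
```

A prediction counts as a true positive at a given threshold when its OKS exceeds that threshold, and AP is then averaged across a range of thresholds.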
