UDAPose: Unsupervised Domain Adaptation for Low-Light Human Pose Estimation
About
Low-visibility scenarios, such as low-light conditions, pose significant challenges to human pose estimation due to the scarcity of annotated low-light datasets and the loss of visual information under poor illumination. Recent domain adaptation techniques attempt to utilize well-lit labels by augmenting well-lit images to mimic low-light conditions. However, handcrafted augmentations oversimplify noise patterns, while learning-based methods often fail to preserve high-frequency low-light characteristics, producing unrealistic images that cause pose models to generalize poorly to real low-light scenes. Moreover, recent pose estimators rely on image cues through image-to-keypoint cross-attention, but these cues become unreliable under low-light conditions. To address these issues, we propose Unsupervised Domain Adaptation for Pose Estimation (UDAPose), a novel framework that synthesizes low-light images and dynamically fuses visual cues with pose priors for improved pose estimation. Specifically, our synthesis method incorporates a Direct-Current-based High-Pass Filter (DHF) and a Low-light Characteristics Injection Module (LCIM) to inject high-frequency details from input low-light images, overcoming the rigidity and detail loss of existing approaches. Furthermore, we introduce a Dynamic Control of Attention (DCA) module that adaptively balances image cues with learned pose priors in the Transformer architecture. Experiments show that UDAPose outperforms state-of-the-art methods, with notable AP gains of 10.1 points (56.4% relative) on the ExLPose-test hard set (LL-H) and 7.4 points (31.4% relative) in cross-dataset validation on EHPT-XC. Code: https://github.com/Vision-and-Multimodal-Intelligence-Lab/UDAPose
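
The idea of adaptively balancing image cues against learned pose priors can be illustrated with a simple gated blend. This is only a minimal sketch and not the DCA module itself, whose design is not described in this abstract: the function names, the scalar `reliability` score in [0, 1], and the `sharpness` parameter are all hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(image_cue: List[float], pose_prior: List[float],
         reliability: float, sharpness: float = 4.0) -> List[float]:
    """Blend an image-derived feature with a pose-prior feature.

    Hypothetical illustration: `reliability` is an assumed score in [0, 1]
    describing how trustworthy the image cue is (low in dark regions).
    The gate moves toward 1 (trust the image) as reliability rises and
    toward 0 (trust the prior) as it falls.
    """
    gate = sigmoid(sharpness * (reliability - 0.5))
    return [gate * c + (1.0 - gate) * p for c, p in zip(image_cue, pose_prior)]


# A neutral reliability of 0.5 gives an even blend of the two inputs.
print(fuse([1.0, 2.0], [0.0, 0.0], reliability=0.5))  # [0.5, 1.0]
```

In this sketch a well-lit keypoint region would receive a high reliability and be driven mostly by the image cue, while an extremely dark region would fall back on the prior.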
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Human Pose Estimation | ExLPose LL-N (test) | AP@0.5:0.95 | 39.4 | 21 |
| Human Pose Estimation | ExLPose LL-E (test) | AP@0.5:0.95 | 11.7 | 21 |
| Human Pose Estimation | ExLPose WL (test) | AP@0.5:0.95 | 67.3 | 18 |
| Human Pose Estimation | ExLPose LL-H (test) | AP@0.5:0.95 | 28 | 18 |
| 2D Human Pose Estimation | ExLPose WL | AR@0.5:0.95 | 75 | 11 |
| 2D Human Pose Estimation | ExLPose LL-A (test) | AR@0.5:0.95 | 36.5 | 11 |
| 2D Human Pose Estimation | ExLPose LL-H (test) | AR@0.5:0.95 | 37.4 | 11 |
| 2D Human Pose Estimation | ExLPose LL-E (test) | AR@0.5:0.95 | 20.4 | 11 |
| Human Pose Estimation | ExLPose-OCN | AP@0.5:0.95 (Avg) | 51.4 | 11 |
| Human Pose Estimation | EHPT-XC (cross-dataset) | AP@0.5:0.95 | 31 | 11 |
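
The percentage gains quoted in the abstract can be cross-checked against the AP values listed above. The short sketch below assumes the percentages are relative improvements over the strongest prior method, so the baseline AP is the reported AP minus the absolute gain; `implied_baseline` is a helper name introduced here for illustration.

```python
from typing import Tuple


def implied_baseline(ap: float, gain: float) -> Tuple[float, float]:
    """Return (baseline AP, relative gain in percent).

    Assumes `gain` is the absolute AP improvement over the baseline,
    so that baseline = ap - gain.
    """
    baseline = ap - gain
    return baseline, 100.0 * gain / baseline


# AP values are taken from the benchmark listing, gains from the abstract.
for name, ap, gain in [("ExLPose LL-H (test)", 28.0, 10.1),
                       ("EHPT-XC (cross-dataset)", 31.0, 7.4)]:
    baseline, relative = implied_baseline(ap, gain)
    print(f"{name}: baseline AP {baseline:.1f}, relative gain {relative:.1f}%")
# ExLPose LL-H (test): baseline AP 17.9, relative gain 56.4%
# EHPT-XC (cross-dataset): baseline AP 23.6, relative gain 31.4%
```

Both results match the 56.4% and 31.4% figures in the abstract, which supports reading the listed LL-H and EHPT-XC values as AP 28 and AP 31.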