From Frames to Sequences: Temporally Consistent Human-Centric Dense Prediction

About

In this work, we address temporally consistent human-centric dense prediction across video sequences. Existing models achieve strong per-frame accuracy but often flicker under motion, occlusion, and lighting changes, and paired human video supervision covering multiple dense tasks is rarely available to them. We close this gap with a scalable synthetic data pipeline that generates photorealistic human frames and motion-aligned sequences with pixel-accurate depth, normals, and masks. Unlike prior static synthetic data pipelines, ours provides both frame-level labels for spatial learning and sequence-level supervision for temporal learning. Building on this data, we train a unified ViT-based dense predictor that (i) injects an explicit human geometric prior via CSE embeddings and (ii) improves geometry-feature reliability with a lightweight channel reweighting module applied after feature fusion. Our two-stage training strategy combines static pretraining with dynamic sequence supervision, so the model first acquires robust spatial representations and then refines temporal consistency across motion-aligned sequences. Extensive experiments show that we achieve state-of-the-art performance on THuman2.1 and Hi4D and generalize effectively to in-the-wild videos.
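The abstract does not specify how the channel reweighting module is built. As a rough illustration only, the sketch below uses squeeze-and-excitation style gating, a common way to reweight channels, and should not be read as the authors' design. The function name `channel_reweight`, the weight arguments `w1`, `b1`, `w2`, `b2`, and the flat-list feature layout are all hypothetical choices made for this example.

```python
import math


def channel_reweight(features, w1, b1, w2, b2):
    """SE-style channel gating (an assumed stand-in, not the paper's module).

    features: list of C channels, each a flat list of floats (fused features).
    w1, b1:   bottleneck layer, w1 has shape [H][C], b1 has length H.
    w2, b2:   expansion layer, w2 has shape [C][H], b2 has length C.
    Returns (reweighted_features, gates).
    """
    # Squeeze: global average of each channel.
    squeezed = [sum(ch) / len(ch) for ch in features]

    # Excite, step 1: bottleneck linear layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
        for row, b in zip(w1, b1)
    ]

    # Excite, step 2: expansion linear layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
        for row, b in zip(w2, b2)
    ]

    # Scale: multiply every value in a channel by that channel's gate.
    reweighted = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return reweighted, gates


# Toy usage: two channels, one hidden unit, zero expansion weights.
feats = [[1.0, 3.0], [0.0, 0.0]]
out, gates = channel_reweight(
    feats, w1=[[1.0, 0.0]], b1=[0.0], w2=[[0.0], [0.0]], b2=[0.0, 0.0]
)
```

With zero expansion weights and biases every gate is sigmoid(0) = 0.5, so each channel is simply halved. In a trained module the gates would differ per channel, suppressing channels judged less reliable.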

Xingyu Miao, Junting Dong, Qin Zhao, Yuhang Yang, Junhao Chen, Yang Long • 2026

Related benchmarks

Task                       Dataset                       Metric              Result  Rank
Image Matting              P3M-500-NP                    SAD (Trimap)        11.88   27
Surface Normal Estimation  Hi4D                          MAE                 15      18
Image Matting              P3M-500-P                     SAD                 11.63   16
Depth Estimation           Hi4D (test)                   RMSE                0.07    15
Depth Estimation           THuman Face 2.1 (test)        RMSE                0.0147  15
Depth Estimation           THuman UpperBody 2.1 (test)   RMSE                0.0174  15
Depth Estimation           THuman FullBody 2.1 (test)    RMSE                0.0218  15
Video Depth Estimation     Hi4D                          OPW                 0.007   13
Surface Normal Estimation  THuman 2.1                    Mean Angular Error  16      10
Human Matting              PPM-100                       SAD                 70.71   6
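
For readers unfamiliar with the metrics in the benchmark list, the following is a minimal sketch of three of them under their textbook definitions: RMSE for depth, SAD (sum of absolute differences) for alpha mattes, and mean angular error for surface normals. The function names are hypothetical. Reporting the angular error in degrees is an assumption, and the page does not state the masking, scaling, or normalization conventions behind the listed numbers, so these functions are not expected to reproduce them. OPW is not sketched because the page does not define it.

```python
import math


def rmse(pred, target):
    """Root mean squared error over paired depth values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def sad(pred_alpha, target_alpha):
    """Sum of absolute differences over paired alpha values (no rescaling)."""
    return sum(abs(p - t) for p, t in zip(pred_alpha, target_alpha))


def mean_angular_error_deg(pred_normals, target_normals):
    """Mean angle, in degrees, between paired 3D normal vectors."""
    total = 0.0
    for p, t in zip(pred_normals, target_normals):
        dot = sum(a * b for a, b in zip(p, t))
        norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in t))
        # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
        cos_angle = max(-1.0, min(1.0, dot / norm))
        total += math.degrees(math.acos(cos_angle))
    return total / len(pred_normals)


# Toy usage on hand-made values.
depth_err = rmse([1.0, 2.0], [1.0, 4.0])
matte_err = sad([0.0, 1.0, 0.5], [0.0, 0.5, 0.5])
normal_err = mean_angular_error_deg(
    [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
    [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
)
```

In the toy usage, the first pair of normals is perpendicular and the second is identical, so the mean angular error is the average of 90 and 0 degrees.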