AHOY! Animatable Humans under Occlusion from YouTube Videos with Gaussian Splatting and Video Diffusion Priors

About

We present AHOY, a method for reconstructing complete, animatable 3D Gaussian avatars from in-the-wild monocular video despite heavy occlusion. Existing methods assume unoccluded input (a fully visible subject, often in a canonical pose), which excludes the vast majority of real-world footage, where people are routinely occluded by furniture, objects, or other people. Reconstructing from such footage poses fundamental challenges: large body regions may never be observed, and multi-view supervision per pose is unavailable. We address these challenges with four contributions: (i) a hallucination-as-supervision pipeline that uses identity-finetuned diffusion models to generate dense supervision for previously unobserved body regions; (ii) a two-stage canonical-to-pose-dependent architecture that bootstraps from sparse observations to full pose-dependent Gaussian maps; (iii) a map-pose/LBS-pose decoupling that absorbs multi-view inconsistencies from the generated data; and (iv) a head/body split supervision strategy that preserves facial identity. We evaluate on YouTube videos and on multi-view capture data with significant occlusion, and demonstrate state-of-the-art reconstruction quality. We also demonstrate that the resulting avatars are robust enough to be animated with novel poses and composited into 3DGS scenes captured using cell-phone video. Our project page is available at https://miraymen.github.io/ahoy/

Aymen Mir, Riza Alp Guler, Xiangjun Tang, Peter Wonka, Gerard Pons-Moll • 2026
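
The abstract refers to an LBS pose without spelling out the skinning step. As a rough illustration only, here is a minimal sketch of generic linear blend skinning applied to 3D points such as Gaussian centers. It is not taken from AHOY: the function names, the 3x4 transform layout, and the decision to ignore Gaussian rotations and covariances are all assumptions made for this example.

```python
def apply_transform(transform, point):
    """Apply a 3x4 transform [R | t], given as three rows of four numbers, to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in transform)


def linear_blend_skinning(points, weights, bone_transforms):
    """Pose canonical points with generic linear blend skinning (hypothetical helper).

    points: list of (x, y, z) canonical positions.
    weights: one list of per-bone weights for each point, each summing to 1.
    bone_transforms: one 3x4 transform per bone, mapping canonical to posed space.
    """
    posed = []
    for point, point_weights in zip(points, weights):
        blended = [0.0, 0.0, 0.0]
        for weight, transform in zip(point_weights, bone_transforms):
            moved = apply_transform(transform, point)
            for axis in range(3):
                # Weighted sum of the point as moved by each bone.
                blended[axis] += weight * moved[axis]
        posed.append(tuple(blended))
    return posed


# Two bones: one left in place, one translated by 2 units along x.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
shift_x = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]
posed_points = linear_blend_skinning([(1.0, 1.0, 1.0)], [[0.5, 0.5]], [identity, shift_x])
```

A point influenced equally by both bones ends up halfway between the two bone-wise results. How AHOY separates the pose fed to its Gaussian maps from the pose used for skinning is not described in this listing and is not modeled here.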

Related benchmarks

Task                    Dataset                        Result        Rank
Novel View Synthesis    BEHAVE Novel View              PSNR 24.12    5
Static Reconstruction   YouTube Static Occluded        PSNR 22.01    5
Static Reconstruction   YouTube Static Canonical       PSNR 23.05    5
Animation               YouTube Occluded input         PSNR 22.81    3
Animation               YouTube Canonical-pose input   PSNR 22.83    3
Novel Pose Synthesis    BEHAVE Novel Pose              PSNR 22.81    3
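
All results above are reported as PSNR. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), in decibels. The listing does not state the evaluation protocol (image resolution, masking, or pixel value range), so the flat pixel lists and the default `max_value=1.0` are assumptions made for illustration.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images given as flat pixel sequences.

    max_value is the largest possible pixel value (1.0 assumed here, 255 for 8-bit images).
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01, i.e. about 20 dB.
example_db = psnr([0.0, 0.0], [0.1, 0.1])
```

Higher is better, and because the scale is logarithmic, a 1 dB gap corresponds to roughly a 21% reduction in mean squared error.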