UniHuman: A Unified Model for Editing Human Images in the Wild
About
Human image editing includes tasks like changing a person's pose, changing their clothing, or editing the image according to a text prompt. However, prior work often tackles these tasks separately, overlooking the benefit of mutual reinforcement from learning them jointly. In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings. To enhance the model's generation quality and generalization capacity, we leverage guidance from human visual encoders and introduce a lightweight pose-warping module that can exploit different pose representations, accommodating unseen textures and patterns. Furthermore, to bridge the gap between existing human editing benchmarks and real-world data, we curated 400K high-quality human image-text pairs for training and collected 2K human images for out-of-domain testing, both encompassing diverse clothing styles, backgrounds, and age groups. Experiments on both in-domain and out-of-domain test sets demonstrate that UniHuman outperforms task-specific models by a significant margin. In user studies, users preferred UniHuman in 77% of cases on average. Our project is available at https://github.com/NannanLi999/UniHuman.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Reposing | DeepFashion In-Domain | FID | 5.089 | 10 |
| Reposing | WPose (Out-of-Domain) | FID | 27.571 | 10 |
| Human Avatar Generation | WPose out-of-domain (test) | PSNR | 17.49 | 8 |
| Pose-conditioned avatar generation | WPose (Out-of-Domain) | M-PSNR | 17.64 | 8 |
| Human Avatar Generation | DeepFashion In-Domain (test) | PSNR | 13.87 | 8 |
| Pose-conditioned avatar generation | DeepFashion In-Domain | PSNR | 14.05 | 8 |
| Virtual Try-On | DressCode Paired 512x512 | FID | 3.446 | 5 |
| Virtual Try-On | DressCode 512x512 (Unpaired) | FID | 5.529 | 5 |
| Reposing | WPose out-of-domain (test) | Pose Accuracy | 84.3 | 4 |
| Virtual Try-On | WVTON Out-of-Domain (test) | Texture Consistency | 76.4 | 4 |
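
For readers unfamiliar with the PSNR values reported in the table above, the following is a minimal sketch of the standard peak signal-to-noise ratio definition, 10 * log10(MAX^2 / MSE), in decibels. It is an illustration only: the function name `psnr`, the flat pixel-list input, and the 8-bit `max_value` default are assumptions made here, and the listing does not state how the benchmarks preprocess images or how the M-PSNR variant is computed.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Standard PSNR in dB between two equal-length sequences of pixel values.

    Hypothetical helper for illustration; not taken from the UniHuman code.
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the generated image is closer to the reference pixel by pixel, whereas for FID lower values are better.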