
Improving Diffusion Models for Authentic Virtual Try-on in the Wild

About

This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment, given a pair of images depicting the person and the garment, respectively. Previous works adapt existing exemplar-based inpainting diffusion models for virtual try-on to improve the naturalness of the generated visuals compared to other methods (e.g., GAN-based), but they fail to preserve the identity of the garments. To overcome this limitation, we propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images. Our method, coined IDM-VTON, uses two different modules to encode the semantics of the garment image; given the base UNet of the diffusion model, 1) the high-level semantics extracted from a visual encoder are fused into the cross-attention layer, and then 2) the low-level features extracted from a parallel UNet are fused into the self-attention layer. In addition, we provide detailed textual prompts for both garment and person images to enhance the authenticity of the generated visuals. Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity. Our experimental results show that our method outperforms previous approaches (both diffusion-based and GAN-based) in preserving garment details and generating authentic virtual try-on images, both qualitatively and quantitatively. Furthermore, the proposed customization method demonstrates its effectiveness in a real-world scenario. More visualizations are available on our project page: https://idm-vton.github.io

Yisol Choi, Sangkyung Kwak, Kyungmin Lee, Hyungwon Choi, Jinwoo Shin • 2024
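
The two fusion paths named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the features are fused, so the sketch assumes one common reading, in which low-level garment features are concatenated with the person tokens along the token axis before self-attention, and high-level garment tokens from a visual encoder get their own cross-attention branch that is added to the text branch. The function names, the `scale` parameter, the token counts, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Scaled dot-product attention on 2D arrays of shape (tokens, dim).
    # Learned projections are omitted to keep the sketch short.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def fused_self_attention(person_tokens, garment_tokens):
    # Assumed fusion: concatenate low-level garment features with the
    # person tokens along the token axis, attend jointly, then keep
    # only the rows that belong to the person tokens.
    joint = np.concatenate([person_tokens, garment_tokens], axis=0)
    out = attention(joint, joint, joint)
    return out[: person_tokens.shape[0]]


def fused_cross_attention(person_tokens, text_tokens, image_tokens, scale=1.0):
    # Assumed fusion: a text branch plus a separate branch over the
    # high-level garment tokens, combined by a weighted sum.
    text_out = attention(person_tokens, text_tokens, text_tokens)
    image_out = attention(person_tokens, image_tokens, image_tokens)
    return text_out + scale * image_out


rng = np.random.default_rng(0)
person = rng.normal(size=(16, 8))   # hypothetical person latents
garment = rng.normal(size=(16, 8))  # hypothetical parallel-UNet features
text = rng.normal(size=(5, 8))      # hypothetical prompt embeddings
image = rng.normal(size=(4, 8))     # hypothetical visual-encoder tokens

h = person + fused_self_attention(person, garment)
h = h + fused_cross_attention(h, text, image)
print(h.shape)  # (16, 8)
```

Setting `scale=0.0` removes the garment branch from the cross-attention step, which makes it easy to compare outputs with and without the high-level garment semantics in this toy setting.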

Related benchmarks

Task                  Dataset                                  Metric          Result  Rank
Virtual Try-On        VITON-HD (test)                          SSIM            88.1    48
Virtual Try-On        VITON-HD 1.0 (test)                      FID             9.12    27
Virtual Try-On        DressCode (test)                         FID             3.472   23
Virtual Try-On        VITON paired HD (test)                   FID             5.76    19
Image Virtual Try-on  VITON-HD                                 LPIPS           0.0789  14
Virtual Try-On        VITON-HD unpaired 1.0 (test)             FID             9.84    14
Virtual Try-On        DressCode 1.0 (test)                     FID             5.32    14
Virtual Try-On        DressCode Upper (unpaired and paired)    FIDu            14.385  13
Virtual Try-On        DressCode Dresses (unpaired and paired)  FIDu            19.745  13
Virtual Try-On        DressCode Lower unpaired and paired      FID (Unpaired)  21.554  13
Showing 10 of 25 rows
