D$^4$-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On

About

In this paper, we introduce D$^4$-VTON, an innovative solution for image-based virtual try-on. We address challenges from previous studies, such as semantic inconsistencies before and after garment warping and the reliance on static, annotation-driven clothing parsers. We also tackle the difficulty that diffusion-based VTON models face when handling simultaneous tasks such as inpainting and denoising. Our approach relies on two key technologies. First, Dynamic Semantics Disentangling Modules (DSDMs) extract abstract semantic information from garments to create distinct local flows, improving the precision of garment warping in a self-discovered manner. Second, by integrating a Differential Information Tracking Path (DITP), we establish a novel diffusion-based VTON paradigm. This path captures differential information between incomplete try-on inputs and their complete versions, enabling the network to handle multiple degradations independently, thereby minimizing learning ambiguities and achieving realistic results with minimal overhead. Extensive experiments show that D$^4$-VTON significantly outperforms existing methods in both quantitative metrics and qualitative evaluations, confirming its capability to generate realistic images and ensure semantic consistency.
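
The notion of "differential information" between an incomplete try-on input and its complete version can be illustrated with a toy pixel-space residual. This is only a sketch under strong assumptions: the abstract does not describe how DITP computes or uses this signal inside the diffusion model, and the helper names (`make_incomplete`, `differential`, `restore`), the 2x2 image, and the mask are all hypothetical.

```python
# Toy illustration only: not the paper's DITP, which is a learned path
# inside a diffusion model. All names and values here are hypothetical.

def make_incomplete(complete, mask, fill=0.0):
    """Blank out masked pixels to mimic an incomplete try-on input."""
    return [[fill if m else p for p, m in zip(prow, mrow)]
            for prow, mrow in zip(complete, mask)]

def differential(complete, incomplete):
    """Element-wise difference between the complete and incomplete images."""
    return [[c - i for c, i in zip(crow, irow)]
            for crow, irow in zip(complete, incomplete)]

def restore(incomplete, diff):
    """Add the differential back onto the incomplete input."""
    return [[i + d for i, d in zip(irow, drow)]
            for irow, drow in zip(incomplete, diff)]

complete = [[0.2, 0.8], [0.5, 0.1]]           # assumed "complete" image
mask = [[False, True], [False, False]]        # True marks a missing region
incomplete = make_incomplete(complete, mask)  # masked pixel replaced by 0.0
diff = differential(complete, incomplete)     # non-zero only where masked
restored = restore(incomplete, diff)
```

In this toy setting the differential is non-zero only inside the masked region, which is the intuition behind treating the missing-content degradation separately from noise.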

Zhaotong Yang, Zicheng Jiang, Xinzhe Li, Huiyu Zhou, Junyu Dong, Huaidong Zhang, Yong Du • 2024

Related benchmarks

Task            Dataset                                  Result                 Rank
Virtual Try-On  VITON-HD (test)                          SSIM 79                48
Virtual Try-On  StreetTryOn Shop-to-Street               FID 35.003             13
Virtual Try-On  DressCode Upper (unpaired and paired)    FIDu 20.726            13
Virtual Try-On  DressCode Lower unpaired and paired      FID (Unpaired) 34.088  13
Virtual Try-On  DressCode Dresses (unpaired and paired)  FIDu 42.23             13
