TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models
About
This paper introduces Virtual Try-Off (VTOFF), a novel task generating standardized garment images from single photos of clothed individuals. Unlike Virtual Try-On (VTON), which digitally dresses models, VTOFF extracts canonical garment images, demanding precise reconstruction of shape, texture, and complex patterns, enabling robust evaluation of generative model fidelity. We propose TryOffDiff, adapting Stable Diffusion with SigLIP-based visual conditioning to deliver high-fidelity reconstructions. Experiments on VITON-HD and Dress Code datasets show that TryOffDiff outperforms adapted pose transfer and VTON baselines. We observe that traditional metrics such as SSIM inadequately reflect reconstruction quality, prompting our use of DISTS for reliable assessment. Our findings highlight VTOFF's potential to improve e-commerce product imagery, advance generative model evaluation, and guide future research on high-fidelity reconstruction. Demo, code, and models are available at: https://rizavelioglu.github.io/tryoffdiff
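The remark about SSIM can be illustrated with a toy calculation. The sketch below is not the authors' evaluation code: it computes a simplified, single-window ("global") SSIM over flat lists of grayscale pixel values, whereas standard SSIM averages the same expression over local windows. The function name `global_ssim` and the pixel values are hypothetical, chosen only to mimic a garment photo with a large white background.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two equal-length lists of pixel values.

    Simplification (assumption): statistics are taken over the whole image
    rather than averaged over local windows as in standard SSIM.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilising constants from the original SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    luminance = (2 * mean_x * mean_y + c1) / (mean_x * mean_x + mean_y * mean_y + c1)
    structure = (2 * cov + c2) / (var_x + var_y + c2)
    return luminance * structure


# Hypothetical 16-pixel "images": 12 white background pixels plus a garment.
reference = [255] * 12 + [40, 200, 40, 200]   # striped garment
flattened = [255] * 12 + [120, 120, 120, 120]  # same region, pattern lost

score = global_ssim(reference, flattened)
print(f"SSIM(reference, flattened) = {score:.3f}")
```

In this toy case the stripe pattern is erased entirely, yet the score stays around 0.81 because the shared white background dominates the statistics. That is consistent with the motivation for reporting a perceptual metric such as DISTS alongside SSIM.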
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Virtual Try-on | VITON-HD | LPIPS 39.56 | 14 |
| Virtual Try-Off | VITON-HD (test) | SSIM 80.3 | 6 |
| Virtual Try-Off | VITON-HD | SSIM 77.9 | 5 |
| Virtual Try-Off | DressCode upper-body | SSIM 76.6 | 3 |