NGL-Prompter: Training-Free Sewing Pattern Estimation from a Single Image
About
Estimating sewing patterns from images is a practical approach for creating high-quality 3D garments. Due to the lack of real-world pattern-image paired data, prior approaches fine-tune large vision-language models (VLMs) on synthetic garment datasets generated by randomly sampling from the parametric garment model GarmentCode. However, these methods often struggle to generalize to in-the-wild images, fail to capture real-world correlations between garment parts, and are typically restricted to single-layer outfits. In contrast, we observe that VLMs are effective at describing garments in natural language, yet perform poorly when asked to directly regress GarmentCode parameters from images. To bridge this gap, we propose NGL (Natural Garment Language), a novel intermediate language that restructures GarmentCode into a representation more understandable to language models. Leveraging this language, we introduce NGL-Prompter, a training-free pipeline that queries large VLMs to extract structured garment parameters, which are then deterministically mapped to valid GarmentCode. We evaluate our method on Dress4D, CloSe, and a newly collected dataset of approximately 5,000 in-the-wild fashion images. Our approach achieves state-of-the-art performance on standard geometry metrics and is strongly preferred over existing baselines in both human and GPT-based perceptual evaluations. Furthermore, NGL-Prompter can recover multi-layer outfits, whereas competing methods focus mostly on single-layer garments, highlighting its strong generalization to real-world images even with occluded parts. These results demonstrate that accurate sewing pattern reconstruction is possible without costly model training. Our code and data will be released for research use.
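The abstract's final pipeline step, deterministically mapping structured answers to pattern parameters, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the field names, the labels, the numeric values, and the `GarmentParams` and `map_answers` names are hypothetical and do not reflect the actual NGL vocabulary or the GarmentCode schema. The sketch only shows the general idea of a fixed lookup that rejects anything outside a closed vocabulary, so the same answers always yield the same parameters.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: every key, label, and number below is an
# illustrative assumption, not the actual NGL or GarmentCode schema.
SLEEVE_LENGTH = {"sleeveless": 0.0, "short": 0.3, "elbow": 0.5, "long": 1.0}
SKIRT_LENGTH = {"mini": 0.3, "knee": 0.5, "midi": 0.7, "maxi": 1.0}


@dataclass(frozen=True)
class GarmentParams:
    """Hypothetical numeric parameters produced by the mapping."""
    sleeve_length: float
    skirt_length: float


def map_answers(answers: dict) -> GarmentParams:
    """Map categorical answers to numeric parameters via fixed lookup tables."""

    def lookup(table: dict, key: str) -> float:
        # Normalize case and whitespace, then require an exact vocabulary match.
        label = str(answers.get(key, "")).strip().lower()
        if label not in table:
            raise ValueError(f"unsupported value {label!r} for {key!r}")
        return table[label]

    return GarmentParams(
        sleeve_length=lookup(SLEEVE_LENGTH, "sleeve_length"),
        skirt_length=lookup(SKIRT_LENGTH, "skirt_length"),
    )
```

Under these assumptions, `map_answers({"sleeve_length": "Long", "skirt_length": "knee"})` returns a `GarmentParams` with both fields filled from the tables, while an answer outside the vocabulary raises `ValueError` instead of producing an invalid pattern.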
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Sewing pattern estimation | CloSe single-layer | Chamfer Distance | 2.08 | 7 |
| Sewing pattern reconstruction | Dress4D single-layer | Chamfer Distance | 1.19 | 7 |
| Human evaluation of garment reconstruction | ASOS_5K multi-layer | Mean Score | 4.052 | 5 |
| Human evaluation of garment reconstruction | ASOS_5K single-layer | Mean Score | 4.797 | 5 |
| Text-to-garment reconstruction | ASOS_5K single-layer | CLIP Score (Short Text) | 0.29 | 3 |
| Text-to-garment reconstruction | ASOS_5K multi-layer | CLIP Score (Short Text) | 0.276 | 3 |
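For readers unfamiliar with the Chamfer Distance metric in the table, the sketch below shows one common symmetric formulation, in which lower values indicate a closer match between two point sets. The exact convention behind the reported numbers (squared or unsquared distances, summed or averaged terms, units, point sampling) is not stated here, so this is an assumed variant and not necessarily the benchmark's evaluation code. The function name `chamfer_distance` is likewise illustrative.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed convention: unsquared Euclidean distances, averaged within each
    direction, with the two directions summed.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def one_way(src, dst):
        # Mean distance from each source point to its nearest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)
```

This brute-force version compares every pair of points, which is adequate for small examples; larger point clouds are usually handled with a spatial index or a vectorized implementation.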