VectorArk: Learning Practical Image Vectorization with Rounded Polygon Representation

About

Recent vision-language model (VLM)-based approaches have achieved impressive results on image vectorization tasks. However, they are typically evaluated on synthetic benchmarks, where clean SVGs are rasterized at high resolution and then re-vectorized. As a result, these methods generalize poorly to real-world scenarios, such as images with unknown rasterization methods or those generated by text-to-image models. We introduce VectorArk, a new VLM-based model designed for robust and practical image vectorization. VectorArk employs a novel rounded polygon representation that simplifies the learning process while naturally producing smooth, visually appealing primitives. We also propose a degradation model that enhances robustness across diverse and imperfect inputs. Our experiments show that, in contrast to previous methods, VectorArk achieves superior geometric completeness and artifact suppression across multiple datasets, with comprehensive ablations validating the contribution of each component.

Tarun Gehlaut, Difan Liu, Charu Bansal, Krutik Malani, Souymodip Chakraborty, Ankit Phogat, Matthew Fisher, Vineet Batra • 2026
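
The abstract does not specify how the rounded polygon representation is parameterized, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a closed polygon with one rounding distance per vertex, and it replaces each corner with a quadratic Bézier curve whose control point is the original vertex. The function name `rounded_polygon_path`, the per-vertex `radii` parameter, and the clamping rule are all hypothetical choices made for this example.

```python
import math


def rounded_polygon_path(vertices, radii):
    """Build an SVG path string for a closed polygon with rounded corners.

    Assumed (hypothetical) parameterization: vertices is a list of (x, y)
    tuples and radii[i] is how far along each adjacent edge the rounding
    of corner i starts. This is not taken from the VectorArk paper.
    """
    n = len(vertices)
    if n < 3 or len(radii) != n:
        raise ValueError("need at least 3 vertices and one radius per vertex")

    parts = []
    for i in range(n):
        px, py = vertices[i]
        prev_pt = vertices[i - 1]          # wraps around to the last vertex
        next_pt = vertices[(i + 1) % n]
        len_prev = math.dist(vertices[i], prev_pt)
        len_next = math.dist(vertices[i], next_pt)
        if len_prev == 0 or len_next == 0:
            raise ValueError("consecutive vertices must be distinct")

        # Clamp so roundings of neighbouring corners never overlap on an edge.
        d = max(0.0, min(radii[i], len_prev / 2, len_next / 2))

        # Points where the straight edges end and the rounded corner begins.
        ax = px + (prev_pt[0] - px) * d / len_prev
        ay = py + (prev_pt[1] - py) * d / len_prev
        bx = px + (next_pt[0] - px) * d / len_next
        by = py + (next_pt[1] - py) * d / len_next

        cmd = "M" if i == 0 else "L"
        parts.append(f"{cmd} {ax:.2f} {ay:.2f}")
        # Quadratic Bezier with the original vertex as the control point.
        parts.append(f"Q {px:.2f} {py:.2f} {bx:.2f} {by:.2f}")

    parts.append("Z")
    return " ".join(parts)


if __name__ == "__main__":
    square = [(0, 0), (10, 0), (10, 10), (0, 10)]
    d_attr = rounded_polygon_path(square, [2, 2, 2, 2])
    print(f'<path d="{d_attr}" fill="black"/>')
```

Under these assumptions a shape is described by a short list of vertices and radii rather than explicit curve control points, which is one plausible reason such a representation could be easier for a model to predict while still rendering smooth corners.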

Related benchmarks

Task	Dataset	Result
Image Vectorization	SVG-Stack-Simple (Easy)	LPIPS 0.0274	10
Image Vectorization	SVG-Fonts Simple (Easy)	LPIPS 0.0217	10
Image Vectorization	SVG-Icons-Simple (Easy)	LPIPS 0.0565	10
Image Vectorization	SVG-Emoji-Simple (Easy)	LPIPS 0.0282	10
Image Vectorization	MMSVG-Icon Easy (test)	LPIPS 0.0122	5
Image Vectorization	MMSVG-Icon Med (test)	LPIPS 0.0179	5
Image Vectorization	MMSVG-Icon Hard (test)	LPIPS 0.0324	5
Image Vectorization	SVGX core 250k Easy (test)	LPIPS 0.0168	5
Image Vectorization	SVGX core 250k Med (test)	LPIPS 0.0332	5
Image Vectorization	SVGX_core_250k Hard (test)	LPIPS 0.0677	5

Showing 10 of 35 rows
