
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer

About

Training-free diffusion-based methods have achieved remarkable success in style transfer, eliminating the need for extensive training or fine-tuning. However, due to the lack of targeted training for style information extraction and of constraints on the content image layout, training-free methods often suffer from layout changes of the original content and content leakage from the style image. Through a series of experiments, we discovered that an effective startpoint in the sampling stage significantly enhances the style transfer process. Based on this discovery, we propose StyleSSP, which focuses on obtaining a better startpoint to address layout changes of the original content and content leakage from the style image. StyleSSP comprises two key components: (1) Frequency Manipulation: To improve content preservation, we reduce the low-frequency components of the DDIM latent, allowing the sampling stage to pay more attention to the layout of content images; and (2) Negative Guidance via Inversion: To mitigate content leakage from the style image, we employ negative guidance in the inversion stage to ensure that the startpoint of the sampling stage is distanced from the content of the style image. Experiments show that StyleSSP surpasses previous training-free style transfer baselines, particularly in preserving the original content and minimizing content leakage from the style image. Project page: https://github.com/bytedance/StyleSSP.
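The Frequency Manipulation component described above can be illustrated with a small sketch: attenuating the low-frequency band of a latent via an FFT-based mask. This is an assumption-laden illustration, not the authors' implementation; the function name, the circular mask, and the `radius`/`scale` parameters are all hypothetical.

```python
import numpy as np

def attenuate_low_freq(latent, radius=4, scale=0.5):
    """Illustrative sketch: scale down the low-frequency components of a
    (C, H, W) latent, leaving high frequencies (layout edges) untouched.
    Not the StyleSSP implementation; parameters are hypothetical."""
    C, H, W = latent.shape
    out = np.empty_like(latent, dtype=np.float64)
    # Circular low-frequency mask, centered after fftshift.
    yy, xx = np.mgrid[0:H, 0:W]
    cy, cx = H // 2, W // 2
    low = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    gain = np.where(low, scale, 1.0)
    for c in range(C):
        # Forward FFT, shift DC to the center, apply the gain, invert.
        F = np.fft.fftshift(np.fft.fft2(latent[c]))
        out[c] = np.real(np.fft.ifft2(np.fft.ifftshift(F * gain)))
    return out
```

In a diffusion pipeline, a transform like this would be applied to the DDIM-inverted latent before sampling begins, so that the denoising trajectory relies less on low-frequency (coarse color/illumination) content and more on the content image's structure.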

Ruojun Xu, Weijie Xi, Xiaodi Wang, Yongbo Mao, Zach Cheng• 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Style Transfer | ArtFID Benchmark (test) | ArtFID | 24.18 | 45 |
| Multi-style Image Transfer | MS-COCO (content) & WikiArt (style), two-style setting, Stable Diffusion v1.4 backbone (test) | ArtFID | 27.374 | 9 |
