OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

About

Diffusion models have advanced image stylization significantly, yet two core challenges persist: (1) maintaining consistent stylization in complex scenes, particularly of identity, composition, and fine details, and (2) preventing style degradation in image-to-image pipelines that use style LoRAs. GPT-4o's exceptional stylization consistency highlights the performance gap between open-source methods and proprietary models. To bridge this gap, we propose OmniConsistency, a universal consistency plugin leveraging large-scale Diffusion Transformers (DiTs). OmniConsistency contributes: (1) an in-context consistency learning framework trained on aligned image pairs for robust generalization; (2) a two-stage progressive learning strategy that decouples style learning from consistency preservation to mitigate style degradation; and (3) a fully plug-and-play design compatible with arbitrary style LoRAs under the Flux framework. Extensive experiments show that OmniConsistency significantly enhances visual coherence and aesthetic quality, achieving performance comparable to the commercial state-of-the-art model GPT-4o.
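The abstract does not say how the plugin attaches to a style LoRA, so the following is only a minimal sketch of one way a plug-and-play design could work, assuming both the style module and the consistency module are low-rank updates added independently to a frozen weight. All names (`lora_delta`, `compose`, the toy matrices) are hypothetical and are not taken from OmniConsistency or Flux.

```python
# Illustrative only: additive composition of two low-rank adapters on a
# frozen weight matrix. Every name and value here is an assumption.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def lora_delta(down, up, strength=1.0):
    """Low-rank update: strength * (up @ down)."""
    return scale(matmul(up, down), strength)

def compose(base, adapters):
    """Return base plus the sum of all adapter deltas; base is left untouched."""
    merged = [row[:] for row in base]
    for down, up, strength in adapters:
        merged = add(merged, lora_delta(down, up, strength))
    return merged

# Toy 2x2 frozen weight and two rank-1 adapters (down is 1x2, up is 2x1).
base = [[1.0, 0.0], [0.0, 1.0]]
style = ([[1.0, 0.0]], [[0.5], [0.0]], 1.0)
consistency = ([[0.0, 1.0]], [[0.0], [0.25]], 1.0)

merged = compose(base, [style, consistency])
```

Under this assumption, swapping in a different style adapter only changes one entry of the `adapters` list; the consistency adapter and the base weight stay as they are, which is the property "plug-and-play" suggests.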

Yiren Song, Cheng Liu, Mike Zheng Shou • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Reference-guided Style Transfer	OmniConsistency-Bench	FID	88.282	20
Style Transfer	CSG-Bench	FID	91.14	20
Model-free Try-on	Omni-TryOn (test)	DINO-I	36.06	11
Try-off	Omni-TryOn	CLIP-I	88.53	10
Controllable Style Generation	CSG-Bench Text-guided	Content Preference Rate	6.9	9
Controllable Style Generation	CSG-Bench Reference-guided	Content Preference Rate	3.2	9
Video Style Transfer	3D Chibi Style	Subject Consistency	97.11	5
Video Style Transfer	American Cartoon Style	Subject Consistency	97.12	5
Video Style Transfer	Ghibli Studio Style	Subject Consistency	96.89	3
