Structured State-Space Regularization for Generation-Friendly Image Tokenization

About

Image tokenizers play a central role in modern generative models, where the structure of the latent space critically determines downstream generation performance. A key but underexplored property of effective latent representations is spectral organization: the ability to encode information across frequency components. In this work, we introduce structured state-space regularization, a principled approach to inducing spectral structure in latent spaces. We derive a regularization objective by revisiting state-space models (SSMs) as systems that mimic the behavior of basis functions. This perspective reveals that the hidden states of SSMs are driven to capture frequency components, which yields a novel regularizer that encourages the latent space to capture the spectral structure of images. Experiments demonstrate that our regularizer improves the generative performance of image tokenizers while incurring only a minimal loss in reconstruction fidelity.
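The abstract gives no formulas, so the sketch below is not the paper's regularizer. It only illustrates, under stated assumptions, the standard fact behind the claim that SSM hidden states can capture frequency components: a diagonal linear SSM whose eigenvalues lie on the unit circle accumulates the discrete Fourier coefficients of its input. The function names `ssm_final_states` and `dft`, and the choice of eigenvalues, are hypothetical and chosen for illustration.

```python
import cmath


def ssm_final_states(u, num_modes):
    """Run a diagonal linear SSM x_k[t] = lam_k * x_k[t-1] + u[t] over u.

    Assumption for this sketch: lam_k = exp(2*pi*i*k/n), i.e. eigenvalues
    on the unit circle, one per frequency index k.
    """
    n = len(u)
    lams = [cmath.exp(2j * cmath.pi * k / n) for k in range(num_modes)]
    states = [0j] * num_modes
    for value in u:
        # One recurrence step, applied independently to every mode.
        states = [lam * x + value for lam, x in zip(lams, states)]
    # After n steps the state is sum_t lam^(n-1-t) * u[t]; one more
    # multiplication by lam (with lam^n = 1) gives sum_t lam^(-t) * u[t].
    return [lam * x for lam, x in zip(lams, states)]


def dft(u, num_modes):
    """Direct discrete Fourier transform, used only as a reference."""
    n = len(u)
    return [
        sum(u[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(num_modes)
    ]


signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
states = ssm_final_states(signal, num_modes=4)
reference = dft(signal, num_modes=4)
max_error = max(abs(a - b) for a, b in zip(states, reference))
```

Here `max_error` stays at floating-point rounding level, so each hidden state matches one Fourier coefficient of the input. How the paper turns this view into a training objective for a tokenizer's latent space is not stated on this page.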

Jinsung Lee, Jaemin Oh, Namhun Kim, Dongwon Kim, Byung-Jun Yoon, Suha Kwak • 2026

Related benchmarks

Task                  Dataset                 Result      Rank
Image Reconstruction  ImageNet-1K 1.0 (val)   rFID 0.91   35
Image Generation      ImageNet-1K 1.0 (val)   FID 7.29    17
