HyperAlign: Hypernetwork for Efficient Test-Time Alignment of Diffusion Models

About

Diffusion model alignment aims to bridge the gap between generated outputs and human preferences by enhancing both semantic consistency with textual prompts and overall visual quality. Existing alignment methods face a challenging trade-off: test-time approaches enable input-specific adaptability but introduce significant computational overhead and tend to under-optimize, while fine-tuning approaches risk reward over-optimization and loss of generation diversity. To bridge this gap, we propose HyperAlign, a framework that trains a hypernetwork for efficient and effective test-time alignment. Instead of modifying latent states directly, HyperAlign dynamically generates input-and-state-conditioned low-rank adaptation weights to modulate the denoising trajectory toward target rewards. We introduce multiple HyperAlign variants of varying granularity to balance alignment quality and computational efficiency. The hypernetwork is optimized with a reward objective regularized by preference data to mitigate reward hacking. We evaluate HyperAlign across multiple generative paradigms, including Stable Diffusion and FLUX, where it significantly outperforms existing alignment methods in semantic consistency and visual quality.
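The core idea in the abstract, a hypernetwork that emits low-rank adaptation weights conditioned on the prompt and the current denoising state, can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyHyperNetwork`, the helpers `make_condition` and `adapted_linear`, the linear hypernetwork heads, the zero-initialised `B` head, and the single linear layer standing in for the denoiser are all assumptions made for illustration, and no reward objective or training loop is shown.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian-initialised matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class ToyHyperNetwork:
    """Hypothetical hypernetwork: conditioning vector -> LoRA factors A, B.

    A has shape (rank, d_in) and B has shape (d_out, rank), so B @ A is a
    rank-limited update to a frozen (d_out, d_in) weight matrix.
    """

    def __init__(self, cond_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.W_a = rand_matrix(rank * d_in, cond_dim, rng)
        # Assumption: the B head starts at zero so the adapter is initially
        # a no-op, following the usual LoRA initialisation convention.
        self.W_b = [[0.0] * cond_dim for _ in range(d_out * rank)]

    def __call__(self, cond):
        a_flat = matvec(self.W_a, cond)
        b_flat = matvec(self.W_b, cond)
        A = [a_flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        B = [b_flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return A, B


def make_condition(prompt_emb, latent, t, num_steps):
    """Concatenate prompt embedding, current latent, and normalised timestep."""
    return list(prompt_emb) + list(latent) + [t / num_steps]


def adapted_linear(W, A, B, x):
    """Compute W x + B (A x): frozen weight plus the generated low-rank update."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + d for b, d in zip(base, delta)]


# Toy "denoising" loop: the adapter is regenerated at every step from the
# prompt and the current latent, so the update depends on input and state.
rng = random.Random(1)
d = 4
W = rand_matrix(d, d, rng)                # stand-in for a frozen denoiser layer
prompt_emb = [0.5, -0.2, 0.1]             # stand-in for a text embedding
latent = [rng.gauss(0.0, 1.0) for _ in range(d)]
hyper = ToyHyperNetwork(cond_dim=len(prompt_emb) + d + 1, d_in=d, d_out=d, rank=2)

num_steps = 3
for t in range(num_steps, 0, -1):
    A, B = hyper(make_condition(prompt_emb, latent, t, num_steps))
    latent = adapted_linear(W, A, B, latent)
```

In this sketch the base weights `W` are never modified; only the generated factors change from step to step. Because the `B` head starts at zero, the untrained adapter leaves the base layer's output unchanged, and any steering toward a reward would have to come from training the hypernetwork heads. Regenerating the factors at every step is one possible choice of granularity; the abstract does not specify how its variants differ.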

Xin Xie, Jiaxian Guo, Dong Gong • 2026

Related benchmarks

Task | Dataset | Result | Rank
Text-to-Image Generation | GenEval | Overall Score: 70 | 506
Text-to-Image Generation | GenEval 1.0 (test) | Overall Score: 52 | 85
Text-to-Image Generation | Pick-a-Pic 1K prompts v1 | ImageReward: 1.28 | 20