LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models
About
Personalizing visual generative models to meet specific user needs has gained increasing attention, yet current methods like Low-Rank Adaptation (LoRA) remain impractical due to their demand for task-specific data and lengthy optimization. While a few hypernetwork-based approaches attempt to predict adaptation weights directly, they struggle to map fine-grained user prompts to complex LoRA distributions, limiting their practical applicability. To bridge this gap, we propose LoFA, a general framework that efficiently predicts personalized priors for fast model adaptation. We first identify a key property of LoRA: structured distribution patterns emerge in the relative changes between LoRA and base model parameters. Building on this, we design a two-stage hypernetwork: first predicting relative distribution patterns that capture key adaptation regions, then using these to guide final LoRA weight prediction. Extensive experiments demonstrate that our method consistently predicts high-quality personalized priors within seconds, across multiple tasks and user prompts, even outperforming conventional LoRA that requires hours of processing. Project page: https://jaeger416.github.io/lofa/.
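The "relative changes between LoRA and base model parameters" mentioned above can be illustrated with a minimal numpy sketch. This is an assumption-laden toy example, not the paper's implementation: it treats the relative change as the element-wise ratio |ΔW| / |W| for a single layer, with made-up dimensions and a random LoRA update; the paper's exact definition and normalization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single layer: base weight W and a rank-r LoRA update B @ A.
d, r = 64, 4                         # feature dim and LoRA rank (illustrative)
W = rng.normal(size=(d, d))          # base model weight
A = rng.normal(size=(r, d))          # LoRA down-projection
B = rng.normal(size=(d, r)) * 0.01   # LoRA up-projection (small init)

delta_W = B @ A                      # LoRA's additive update to W

# One possible "relative change" pattern: element-wise ratio of the
# update magnitude to the base-weight magnitude (eps avoids divide-by-zero).
rel = np.abs(delta_W) / (np.abs(W) + 1e-8)

# A hypernetwork in the spirit of LoFA would first predict a coarse map
# like `rel` (highlighting where adaptation concentrates), then use it to
# guide prediction of the actual A and B matrices.
print(rel.shape)
```

Under this toy definition, entries of `rel` with large values mark regions where the adaptation moves the model most relative to its pretrained weights — the kind of structured pattern a first-stage predictor could target before the second stage emits full LoRA weights.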
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Subject-consistent image generation | User Study | User Preference Score | 0.48 | 6 |
| Video Generation | User Study | Preference Rate (Ours) (%) | 56.8 | 5 |
| Identity-Personalized Image Generation | FFHQ (test) | Face Similarity | 0.548 | 4 |
| Personalized Human Action Video Generation | MotionX and MotionX++ (val) | FVD | 589.8 | 4 |
| Pose-conditioned Human Action Video Generation | User Study | Win Rate | 51.2 | 2 |
| Text-to-Video Stylization | VBench | CSD Score | 0.427 | 2 |
| Text-to-Video Stylization | User Study | Win Rate | 52.4 | 2 |