
Self-Supervised Weight Templates for Scalable Vision Model Initialization

About

The increasing scale and complexity of modern model parameters underscore the importance of pre-trained models. However, deployment often demands architectures of varying sizes, exposing limitations of conventional pre-training and fine-tuning. To address this, we propose SWEET, a self-supervised framework that performs constraint-based pre-training to enable scalable initialization in vision tasks. Instead of pre-training a fixed-size model, we learn a shared weight template and size-specific weight scalers under Tucker-based factorization, which promotes modularity and supports flexible adaptation to architectures with varying depths and widths. Target models are subsequently initialized by composing and reweighting the template through lightweight weight scalers, whose parameters can be efficiently learned from minimal training data. To further enhance flexibility in width expansion, we introduce width-wise stochastic scaling, which regularizes the template along width-related dimensions and encourages robust, width-invariant representations for improved cross-width generalization. Extensive experiments on classification, detection, segmentation, and generation tasks demonstrate the state-of-the-art performance of SWEET for initializing variable-sized vision models.
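As a rough illustration of the "shared template plus size-specific scalers" idea, the sketch below composes layer weights from a core tensor through a generic Tucker-style reconstruction in NumPy. It is not the authors' implementation: the function name `compose_weights`, the three-mode layout (output width, input width, depth), the ranks, and the random scalers are all assumptions made for the example, and the width-wise stochastic scaling regularizer is not modeled.

```python
import numpy as np


def compose_weights(template, u_out, u_in, u_layer):
    """Generic Tucker-style reconstruction (hypothetical layout, not SWEET's code).

    template: (r_out, r_in, r_layer)  shared core tensor, reused for every model size
    u_out:    (d_out, r_out)          size-specific scaler for the output width
    u_in:     (d_in, r_in)            size-specific scaler for the input width
    u_layer:  (n_layers, r_layer)     size-specific scaler for the depth
    returns:  (n_layers, d_out, d_in) one weight matrix per layer
    """
    # Contract each mode of the core tensor with its factor matrix.
    return np.einsum("abc,oa,ib,lc->loi", template, u_out, u_in, u_layer)


rng = np.random.default_rng(0)
R_OUT, R_IN, R_LAYER = 8, 8, 4  # assumed ranks of the shared template
template = rng.standard_normal((R_OUT, R_IN, R_LAYER))


def random_scalers(d_out, d_in, n_layers):
    """Stand-in for scalers that would normally be learned per target size."""
    return (
        rng.standard_normal((d_out, R_OUT)),
        rng.standard_normal((d_in, R_IN)),
        rng.standard_normal((n_layers, R_LAYER)),
    )


# The same template initializes two models of different width and depth.
small = compose_weights(template, *random_scalers(16, 16, 2))
large = compose_weights(template, *random_scalers(64, 64, 6))
print(small.shape, large.shape)
```

Under these assumptions only the scalers change between target sizes, and they hold far fewer parameters than the weights they produce, which is consistent with the abstract's description of the scalers as lightweight.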

Yucheng Xie, Fu Feng, Ruixiao Shi, Jing Wang, Yong Rui, Xin Geng • 2026

Related benchmarks

Task                    Dataset            Result                 Rank
Semantic segmentation   ADE20K             mIoU 38.39             1024
Image Classification    Stanford Cars      --                     635
Image Classification    CIFAR100           Accuracy 79.36         347
Image Classification    CUB-200            Accuracy 61.51         106
Image Classification    iNaturalist 2019   Top-1 Acc 64.22        98
Image Classification    Oxford Flowers     Top-1 Accuracy 83.12   83
Image Classification    ImageNet-1K        Top-1 Acc 77.42        75
Image Generation        ImageNet-1K        FID 14.41              55
Image Classification    Food101            Top-1 Accuracy 82.5    33
Instance Segmentation   COCO               APmask 35.16           30
Showing 10 of 11 rows
