Model Fusion via Retrofitting

About

Model fusion seeks to combine independently trained neural networks into a single model without retraining, but is complicated by representational divergence arising from permutation invariance, random initialization, and heterogeneous training data. Existing methods struggle particularly in zero-shot settings under non-IID data distributions, and are often limited to specific architectures or pairwise fusion. We introduce a neuron-centric family of fusion algorithms that frames fusion as a principled representation-matching problem: intermediate neurons across parent models are grouped into target representations, which the fused model's corresponding sub-networks are then trained to approximate. Unlike prior work, our approach incorporates neuron attribution scores to bias alignment toward salient features, and can be applied to any architecture modularizable as a DAG of levels -- empirically validated on VGGs, ResNets, and ViTs. Experiments across standard benchmarks show consistent improvements over existing fusion methods, with the largest gains in zero-shot and non-IID scenarios. Code is available at https://github.com/AndrewSpano/model-fusion-via-retrofitting.

Phoomraphee Luenam, Andreas Spanopoulos, Amit Sant, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh • 2025
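
The abstract describes a two-step idea: group neurons from the parent models into target representations, weighted by attribution scores, then fit the fused model's sub-network to those targets. The toy NumPy sketch below illustrates that idea for a single level. It is not the authors' implementation (see the linked repository for that). The weighted k-means grouping, the farthest-point initialization, the least-squares fit standing in for training, the synthetic data, and all function names are assumptions made for illustration.

```python
import numpy as np


def farthest_point_init(neurons, weights, k):
    """Pick k initial centroids: start at the most salient neuron, then
    repeatedly take the neuron farthest from every centroid chosen so far."""
    chosen = [int(np.argmax(weights))]
    for _ in range(k - 1):
        d2 = ((neurons[:, None, :] - neurons[chosen][None, :, :]) ** 2).sum(-1)
        chosen.append(int(d2.min(axis=1).argmax()))
    return neurons[chosen].copy()


def group_neurons(neurons, weights, k, iters=10):
    """Importance-weighted k-means over neuron activation vectors.

    neurons: (n_neurons, n_samples) activations pooled from all parents.
    weights: (n_neurons,) non-negative attribution scores (hypothetical here).
    Returns (assignments, targets) with targets of shape (n_samples, k).
    """
    centroids = farthest_point_init(neurons, weights, k)
    for _ in range(iters):
        # Squared distance from every neuron to every centroid.
        d2 = ((neurons[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            mask = assign == j
            if weights[mask].sum() > 0:
                # Salient neurons pull the target representation harder.
                centroids[j] = np.average(
                    neurons[mask], axis=0, weights=weights[mask]
                )
    return assign, centroids.T


def fit_linear_level(inputs, targets):
    """Least-squares stand-in for training a fused level on its targets."""
    W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    return W


# Synthetic example: two parents hold noisy, permuted copies of 3 features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(3, 200))
perm = [2, 0, 1]  # parent B stores the same features in a different order
parent_a = latent + 0.05 * rng.normal(size=latent.shape)
parent_b = latent[perm] + 0.05 * rng.normal(size=latent.shape)
neurons = np.vstack([parent_a, parent_b])
scores = np.array([1.0, 0.5, 2.0, 2.0, 1.0, 0.5])  # made-up attribution scores

assign, targets = group_neurons(neurons, scores, k=3)

# Fit a fused linear level mapping made-up level inputs to the targets.
inputs = rng.normal(size=(200, 8))
W = fit_linear_level(inputs, targets)
```

In this toy setting, neurons that encode the same feature in different parents land in the same group despite the permutation, which is the alignment problem the abstract attributes to permutation invariance. A real fusion pipeline would repeat this per level of the architecture and train the fused sub-networks by gradient descent rather than a closed-form fit.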

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | Tiny ImageNet (test) | Accuracy | 54.2 | 722 |
| Image Classification | CIFAR-100 (test) | Accuracy | 75.6 | 63 |
| Image Classification | CIFAR-10 non-IID s=2 | Test Accuracy | 86.6 | 33 |
| Image Classification | CIFAR-10 (2-way sharded split) | Accuracy | 80.9 | 24 |
| Image Classification | CIFAR-100 6-way Sharded (test) | Test Accuracy | 34.7 | 19 |
| Image Classification | CIFAR10 Non-IID (4-Class partition) | Accuracy | 79.7 | 19 |
| Image Classification | CIFAR-100 2-way Sharded (test) | Test Accuracy | 54.5 | 18 |
| Image Classification | Tiny-ImageNet (Sharded 2-way split) | Accuracy | 32.5 | 18 |
| Image Classification | CIFAR-100 4-way Sharded (test) | Accuracy | 41.4 | 17 |
| Image Classification | CIFAR-10 4-way sharded | Accuracy | 56.4 | 14 |
Showing 10 of 17 rows
