Model soups need only one ingredient
About
Fine-tuning large pre-trained models on a target distribution often improves in-distribution (ID) accuracy, but at the cost of out-of-distribution (OOD) robustness, as representations specialize to the fine-tuning data. Weight-space ensembling methods, such as Model Soups, mitigate this effect by averaging multiple checkpoints, but they are computationally prohibitive because they require training and storing dozens of fine-tuned models. In this paper, we introduce MonoSoup, a simple, data-free, hyperparameter-free, post-hoc method that achieves a strong ID-OOD balance using only a single checkpoint. Our method applies Singular Value Decomposition (SVD) to each layer's update (the difference between the fine-tuned and pre-trained weights) and decomposes it into high-energy directions, which capture task-specific adaptation, and low-energy directions, which introduce noise but may still encode residual signals useful for robustness. MonoSoup then uses entropy-based effective rank to automatically reweight these components with layer-wise coefficients that account for the spectral and geometric structure of the model. Experiments on CLIP models fine-tuned on ImageNet and evaluated under natural distribution shifts, as well as on Qwen language models tested on mathematical reasoning and multiple-choice benchmarks, show that this plug-and-play approach is a practical and effective alternative to multi-checkpoint methods, retaining much of their benefit without their computational overhead.
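The per-layer procedure described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the exact coefficient formula, so the function names `effective_rank` and `reweight_update`, the choice of cutting the spectrum at the rounded effective rank, and the fixed `alpha_low` scale on the low-energy part are all assumptions made for illustration. (MonoSoup itself is described as hyperparameter-free, so it presumably derives its coefficients from the spectrum rather than taking one as an argument.) The effective rank here is the standard entropy-based definition: the exponential of the Shannon entropy of the normalized singular values.

```python
import numpy as np


def effective_rank(singular_values, eps=1e-12):
    """Entropy-based effective rank: exp(H(p)), with p the normalized singular values."""
    s = np.asarray(singular_values, dtype=np.float64)
    p = s / (s.sum() + eps)
    p = p[p > 0]  # drop zeros so that 0 * log(0) is treated as 0
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))


def reweight_update(w_pre, w_ft, alpha_low=0.5):
    """Split one layer's update into high- and low-energy parts and rescale the latter.

    alpha_low is a hypothetical stand-in for the layer-wise coefficient;
    it is NOT the coefficient used in the paper.
    """
    delta = w_ft - w_pre  # the layer's fine-tuning update
    u, s, vt = np.linalg.svd(delta, full_matrices=False)

    # Assumption: use the rounded effective rank as the cut-off index.
    k = max(1, min(len(s), int(round(effective_rank(s)))))

    high = (u[:, :k] * s[:k]) @ vt[:k]  # top-k (high-energy) directions
    low = delta - high                  # remaining (low-energy) directions
    return w_pre + high + alpha_low * low, k


rng = np.random.default_rng(0)
w_pre = rng.normal(size=(8, 6))
w_ft = w_pre + 0.1 * rng.normal(size=(8, 6))
merged, k = reweight_update(w_pre, w_ft, alpha_low=0.5)
print(merged.shape, k)
```

In this sketch, `alpha_low=1.0` recovers the fine-tuned weights exactly, while `alpha_low=0.0` keeps only a rank-`k` approximation of the update on top of the pre-trained weights.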
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 56.9 | 983 |
| Image Classification | ImageNet-1k 1.0 (test) | Top-1 Accuracy | 80.34 | 197 |
| Multiple-choice Question Answering | MMLU-Pro | MMLU-Pro Overall Accuracy | 37.5 | 116 |
| Multiple-choice Question Answering | SciQ | Accuracy | 95.3 | 74 |
| Mathematical Reasoning | GSM8K Platinum | Accuracy | 59.4 | 37 |
| Mathematical Reasoning | GSMPlus | Accuracy | 31.9 | 23 |
| Image Classification | ImageNet OOD (Avg of V2, R, Sketch, A, and ObjectNet) 1.0 (test) | Top-1 Accuracy | 51.6 | 17 |
| Image Classification | ImageNet In-distribution 1k | Accuracy | 85.57 | 4 |
| Image Classification | Avg OOD (test) | Accuracy | 56.7 | 4 |