SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors
About
Popular parameter-efficient fine-tuning (PEFT) methods, such as LoRA and its variants, freeze the pre-trained model weights \(W\) and inject learnable matrices \(\Delta W\). These \(\Delta W\) matrices are structured for efficient parameterization, often via low-rank approximations or scaling vectors. However, such methods typically show a performance gap compared to full fine-tuning. Although recent PEFT methods have narrowed this gap, they do so at the cost of additional learnable parameters. We propose SVFT, a simple approach that fundamentally differs from existing methods: the structure imposed on \(\Delta W\) depends on the specific weight matrix \(W\). Specifically, SVFT updates \(W\) as a sparse combination of outer products of its singular vectors, training only the coefficients (scales) of these combinations. This approach allows fine-grained control over expressivity through the number of coefficients. Extensive experiments on language and vision benchmarks show that SVFT recovers up to 96% of full fine-tuning performance while training only 0.006% to 0.25% of the parameters, outperforming existing methods that recover only up to 85% of performance using 0.03% to 0.8% of the trainable parameter budget.
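The core idea above can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' implementation: it assumes the simplest ("plain") coefficient pattern, a diagonal matrix \(M\) with one learnable scale per singular-vector pair, so that \(\Delta W = U M V^\top\) where \(U\), \(V\) come from the SVD of the frozen \(W\). Array sizes and the toy gradient step are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pre-trained weight matrix (toy size; real layers are much larger).
W = rng.standard_normal((8, 6))

# SVFT reparameterizes the update through W's own singular vectors:
#   Delta_W = U @ M @ V^T, with only the sparse coefficient matrix M trained.
# Here M is diagonal (the simplest sparsity pattern): one scale per pair.
U, S, Vt = np.linalg.svd(W, full_matrices=False)

# Learnable coefficients, initialized to zero so training starts exactly at W.
m = np.zeros(S.shape[0])

# Stand-in for a training step updating the coefficients (illustrative only).
m += 0.01 * rng.standard_normal(m.shape)

delta_W = U @ np.diag(m) @ Vt
W_adapted = W + delta_W

# Trainable-parameter count is len(m), versus W.size for full fine-tuning.
print(len(m), W.size)  # 6 trainable scales for a 48-entry weight matrix
```

Denser (off-diagonal) patterns for \(M\) would add more coefficients, which is the knob the abstract refers to for trading parameters against expressivity.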
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 75.9 | 983 |
| Mathematical Reasoning | GSM8K (test) | Accuracy | 76.81 | 751 |
| Mathematical Reasoning | MATH | Accuracy | 24.22 | 535 |
| Image Classification | Food-101 | Accuracy | 78.36 | 494 |
| Image Classification | Flowers102 | Accuracy | 99.28 | 478 |
| Mathematical Reasoning | MATH (test) | Overall Accuracy | 29.98 | 433 |
| Natural Language Understanding | GLUE (test) | SST-2 Accuracy | 95.99 | 416 |
| Image Classification | CIFAR100 | Accuracy | 87.26 | 331 |
| Image Classification | RESISC45 | Accuracy | 79.7 | 263 |
| Image Classification | Food101 (test) | Accuracy | 78.36 | 87 |