LoCO: Low-rank Compositional Rotation Fine-tuning
About
Parameter-efficient fine-tuning (PEFT) has emerged as a critical technique for adapting large-scale foundation models across natural language processing and computer vision. While existing methods such as low-rank adaptation achieve parameter efficiency via low-rank weight updates, they are limited in their ability to preserve the geometric structure of pretrained representations. We introduce Low-rank Compositional Orthogonal fine-tuning (LoCO), a novel PEFT method that constructs orthogonal transformations through low-rank skew-symmetric matrices and compositional rotation chains. We propose an approximation scheme that enables fully parallel computation of compositional rotations, making the approach practical for high-dimensional feature spaces. Our method keeps computational complexity low while preserving orthogonality up to a controlled approximation error. We validate LoCO across diverse domains, including diffusion transformer fine-tuning, vision transformer adaptation, and language model adaptation. LoCO demonstrates superior or competitive performance compared to both existing orthogonal and non-orthogonal methods.
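The core idea of building an orthogonal transformation from low-rank skew-symmetric matrices and chaining several such rotations can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the factorization `U @ V.T - V @ U.T`, the use of the Cayley transform, and the names `low_rank_skew`, `cayley`, and `compose_rotations` are hypothetical and are not taken from the paper. The sketch composes the rotations sequentially and exactly; it does not reproduce the parallel approximation scheme mentioned in the abstract.

```python
import numpy as np

def low_rank_skew(U, V):
    """Skew-symmetric matrix of rank at most 2r from two d x r factors (assumed parameterization)."""
    return U @ V.T - V @ U.T

def cayley(A):
    """Cayley transform (I - A)^{-1} (I + A); orthogonal whenever A is skew-symmetric."""
    I = np.eye(A.shape[0])
    # I - A is invertible because a real skew-symmetric A has purely imaginary eigenvalues.
    return np.linalg.solve(I - A, I + A)

def compose_rotations(factors):
    """Multiply a chain of rotations, one per (U, V) pair, applied left to right."""
    d = factors[0][0].shape[0]
    R = np.eye(d)
    for U, V in factors:
        R = cayley(low_rank_skew(U, V)) @ R
    return R

# Illustrative sizes only: feature dimension d, rank r, chain length k.
rng = np.random.default_rng(0)
d, r, k = 16, 2, 3
factors = [
    (0.1 * rng.standard_normal((d, r)), 0.1 * rng.standard_normal((d, r)))
    for _ in range(k)
]

A0 = low_rank_skew(*factors[0])
R = compose_rotations(factors)

# Deviation from orthogonality; should be at floating-point noise level.
orth_error = np.linalg.norm(R.T @ R - np.eye(d))

# Rotating a stand-in pretrained weight preserves its Frobenius norm.
W = rng.standard_normal((d, d))
W_adapted = R @ W
```

In this sketch each rotation costs `2 * d * r` trainable parameters instead of the `d * (d - 1) / 2` needed for a full skew-symmetric matrix, and because `R` is orthogonal, `R @ W` preserves the norms of and angles between the columns of `W`, which is the kind of geometric structure the abstract refers to.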
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | VTAB 1k (test) | -- | 121 |
| Mathematical Reasoning | GSM8K (val) | Accuracy 50.19 | 108 |
| Mathematical Reasoning | MATH (val) | Accuracy 8.4 | 59 |
| Natural Language Understanding | GLUE (val) | CoLA Score 65.56 | 26 |
| Controllable Image Generation | Canny Controllable Generation Benchmark | F1 Score 49 | 3 |
| Controllable Image Generation | Depth Controllable Generation Benchmark | Mean Squared Error (MSE) 723 | 3 |
| Controllable Image Generation | Mask Controllable Generation Benchmark | MSE 6.93e+3 | 3 |
| Controllable Image Generation | Colorization Controllable Generation Benchmark | MSE 106 | 3 |
| Controllable Image Generation | Deblur Controllable Generation Benchmark | MSE 83 | 3 |