
Decomposing Task Vectors for Refined Model Editing

About

Large pre-trained models have transformed machine learning, yet adapting these models effectively to exhibit precise, concept-specific behaviors remains a significant challenge. Task vectors, defined as the difference between fine-tuned and pre-trained model parameters, provide a mechanism for steering neural networks toward desired behaviors. This has given rise to large repositories dedicated to task vectors tailored for specific behaviors. The arithmetic operation of these task vectors allows for the seamless combination of desired behaviors without the need for large datasets. However, these vectors often contain overlapping concepts that can interfere with each other during arithmetic operations, leading to unpredictable outcomes. We propose a principled decomposition method that separates each task vector into two components: one capturing shared knowledge across multiple task vectors, and another isolating information unique to each specific task. By identifying invariant subspaces across projections, our approach enables more precise control over concept manipulation without unintended amplification or diminution of other behaviors. We demonstrate the effectiveness of our decomposition method across three domains: improving multi-task merging in image classification by 5% using shared components as additional task vectors, enabling clean style mixing in diffusion models without generation degradation by mixing only the unique components, and achieving 47% toxicity reduction in language models while preserving performance on general knowledge tasks by negating the toxic information isolated to the unique component. Our approach provides a new framework for understanding and controlling task vector arithmetic, addressing fundamental limitations in model editing operations.
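The task-vector arithmetic described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: parameters are treated as flat arrays, and the shared component is approximated by the plain mean of the task vectors. That mean is a stand-in chosen for brevity, not the invariant-subspace decomposition the paper proposes. The function names (`task_vector`, `decompose`, `negate`) and the scaling parameter `alpha` are hypothetical.

```python
import numpy as np


def task_vector(finetuned, pretrained):
    """Task vector: fine-tuned parameters minus pre-trained parameters."""
    return finetuned - pretrained


def decompose(task_vectors):
    """Split each task vector into a shared part and a unique part.

    ASSUMPTION: the shared part is the mean of all task vectors. This is a
    simple stand-in for illustration, not the paper's subspace-based method.
    """
    stacked = np.stack(task_vectors)   # shape: (num_tasks, num_params)
    shared = stacked.mean(axis=0)      # one shared vector for all tasks
    unique = stacked - shared          # per-task remainder
    return shared, unique


def negate(pretrained, vector, alpha=1.0):
    """Steer away from a behavior by subtracting a scaled vector."""
    return pretrained - alpha * vector


# Toy data: three "fine-tuned" models that share a common shift.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=8)
common = rng.normal(size=8)
finetuned = [pretrained + common + 0.1 * rng.normal(size=8) for _ in range(3)]

taus = [task_vector(f, pretrained) for f in finetuned]
shared, unique = decompose(taus)

# Negate only the unique part of task 0, leaving the shared part untouched.
edited = negate(pretrained, unique[0], alpha=1.0)
```

By construction, `shared + unique[i]` recovers the original task vector for task `i`, so the split loses no information. Negating only `unique[0]` mirrors the idea in the abstract of removing task-specific information while leaving shared knowledge in place.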

Hamed Damirchi, Ehsan Abbasnejad, Zhen Zhang, Javen Shi • 2025

Related benchmarks

Task                    Dataset        Result          Rank
Language Understanding  MMLU           Accuracy 39.7   756
Reasoning               BBH            Accuracy 37.8   507
Image Classification    EuroSAT        Accuracy 94.7   497
Image Classification    Stanford Cars  Accuracy 71.8   477
Image Classification    DTD            Accuracy 65.2   419
Image Classification    SVHN           Accuracy 86.4   359
Image Classification    GTSRB          Accuracy 95.9   291
Image Classification    RESISC45       Accuracy 83.7   263
Image Classification    MNIST          Accuracy 97.1   263
Image Classification    SUN397         Accuracy 65.1   246
Showing 10 of 13 rows
