
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation

About

Despite growing interest in the Mamba architecture as a potential replacement for the Transformer architecture, parameter-efficient fine-tuning (PEFT) approaches for Mamba remain largely unexplored. In our study, we introduce two key insight-driven contributions for PEFT in the Mamba architecture: (1) Although state-space models (SSMs) have been regarded as the cornerstone of the Mamba architecture, and thus expected to play the primary role in transfer learning, our findings reveal that Projectors -- not SSMs -- are the predominant contributors to transfer learning. (2) Based on this observation, we propose a novel PEFT method specialized for the Mamba architecture: Projector-targeted Diagonal-centric Linear Transformation (ProDiaL). ProDiaL adapts the pretrained Projectors to new tasks by optimizing only diagonal-centric linear transformation matrices, without directly fine-tuning the Projector weights themselves. This targeted approach enables efficient task adaptation using less than 1% of the total parameters, and exhibits strong performance across both vision and language Mamba models, highlighting its versatility and effectiveness.
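The abstract does not spell out ProDiaL's exact parameterization, so the following is only a minimal NumPy sketch of the stated idea: keep the pretrained projector weight frozen and learn a diagonal-centric linear transform applied to it. The function name `prodial_adapt` and the split into a learned diagonal vector `diag_scale` plus an optional off-diagonal part `off_diag` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def prodial_adapt(W, diag_scale, off_diag=None):
    """Adapt a frozen projector weight W (d_out x d_in) by right-multiplying
    it with a diagonal-centric transform T = diag(diag_scale) [+ off_diag].

    Only T's parameters would be trained; W itself is never updated,
    matching the paper's claim of not directly fine-tuning Projector weights.
    """
    d_in = W.shape[1]
    T = np.diag(diag_scale)          # diagonal core: d_in parameters
    if off_diag is not None:
        T = T + off_diag             # optional non-diagonal correction
    return W @ T                     # adapted weight W' = W T

# Identity initialization (all-ones diagonal) leaves the pretrained
# projector unchanged, a natural starting point for adaptation.
W = np.arange(6, dtype=float).reshape(2, 3)
W_adapted = prodial_adapt(W, np.ones(3))
```

Under this sketch, a purely diagonal transform adds only `d_in` trainable parameters per projector, versus `d_out * d_in` for full fine-tuning, which is consistent with the abstract's sub-1% parameter budget.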

Seokil Ham, Hee-Seon Kim, Sangmin Woo, Changick Kim • 2024

Related benchmarks

Task | Dataset | Accuracy | Rank
--- | --- | --- | ---
Commonsense Reasoning | HellaSwag | 38.57 | 1460
Image Classification | StanfordCars | 85.38 | 266
Science Question Answering | ARC Challenge | 30.46 | 234
Commonsense Reasoning | WinoGrande | 53.83 | 231
Science Question Answering | ARC Easy | 53.45 | 101
Image Classification | Caltech | 97.16 | 98
Question Answering | WinoGrande (WG) | 61.96 | 98
Image Classification | Flowers | 88 | 83
Natural Language Understanding | ARC-C | 30.8 | 20
Natural Language Understanding | ARC Easy | 55.18 | 20
Showing 10 of 14 rows
