Efficient Learning With Sine-Activated Low-rank Matrices

About

Low-rank decomposition has emerged as a vital tool for enhancing parameter efficiency in neural network architectures, gaining traction across diverse applications in machine learning. These techniques significantly lower the number of parameters, striking a balance between compactness and performance. However, a common challenge has been the compromise between parameter efficiency and the accuracy of the model, where reduced parameters often lead to diminished accuracy compared to their full-rank counterparts. In this work, we propose a novel theoretical framework that integrates a sinusoidal function within the low-rank decomposition process. This approach not only preserves the parameter efficiency characteristic of low-rank methods but also increases the decomposition's rank, thereby enhancing model performance. Our method proves to be a plug-in enhancement for existing low-rank models, as evidenced by its successful application in Vision Transformers (ViT), Large Language Models (LLMs), Neural Radiance Fields (NeRF), and 3D shape modelling.

Yiping Ji, Hemanth Saratchandran, Cameron Gordon, Zeyu Zhang, Simon Lucey • 2024
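
The rank claim in the abstract can be illustrated with a small numerical sketch. It assumes the sine is applied element-wise to the low-rank product, roughly sin(omega * U V^T) / gain; the function name `sine_low_rank`, the frequency `omega`, the scaling `gain`, and the matrix sizes are illustrative assumptions, not values taken from this page.

```python
import numpy as np


def sine_low_rank(U, V, omega=100.0, gain=1.0):
    """Element-wise sine of a scaled low-rank product.

    Assumed form: sin(omega * U @ V.T) / gain. The names omega and gain
    are hypothetical and chosen only for this illustration.
    """
    return np.sin(omega * (U @ V.T)) / gain


rng = np.random.default_rng(0)
m, n, k = 64, 64, 4  # hypothetical sizes: a 64x64 weight built from rank-4 factors

U = rng.standard_normal((m, k))
V = rng.standard_normal((n, k))

plain = U @ V.T                              # ordinary low-rank weight, rank at most k
activated = sine_low_rank(U, V, omega=100.0)  # same factors, sine applied element-wise

rank_plain = np.linalg.matrix_rank(plain)
rank_activated = np.linalg.matrix_rank(activated)
n_params = U.size + V.size                   # trainable parameters are unchanged

print("parameters:", n_params)
print("rank without sine:", rank_plain)
print("rank with sine:", rank_activated)
```

Both weights are built from the same two factors, so the parameter count stays at k(m + n); only the numerical rank of the resulting matrix differs. This is a sketch of the idea under the stated assumptions, not the authors' implementation.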

Related benchmarks

Task	Dataset	Metric	Result
Commonsense Reasoning	Commonsense 8 Sub-Tasks	Accuracy (8 Sub-Tasks)	85.42	80
Natural Language Understanding	GLUE	SST-2 Accuracy	95.6	62
Language Modeling	Pubmed	Perplexity	6.45	59
Language Modeling	LAMBADA	PPL Change (%)	5.5	41
Language Modeling	WT-103	Perplexity	11.28	20
Instruction Tuning	MT-Bench	Score	6.22	16
Language Modeling	OpenR1	Perplexity (PPL)	3.39	11
Language Modeling	WikiText-103	Perplexity	10.43	9
