On Pruning State-Space LLMs

About

Recent work proposed state-space models (SSMs) as an efficient alternative to transformer-based LLMs. Can these models be pruned to further reduce their computation costs? We adapt several pruning methods to the SSM structure and apply them to four SSM-based LLMs across multiple tasks. We find that such models are quite robust to some pruning methods (e.g., WANDA), while other methods lead to rapid performance degradation.
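As a rough illustration of one of the criteria mentioned above: WANDA scores each weight by the product of its magnitude and the L2 norm of the calibration activations feeding its input feature, then zeroes the lowest-scoring weights within each output row. The NumPy sketch below is not the paper's implementation; the function names and the row-wise unstructured-sparsity setup are illustrative assumptions.

```python
import numpy as np


def wanda_scores(W, X):
    """WANDA importance score: |weight| times the L2 norm of the
    corresponding input activation, computed over a calibration batch.

    W: (out_features, in_features) weight matrix
    X: (n_samples, in_features) calibration activations
    """
    act_norm = np.linalg.norm(X, axis=0)   # (in_features,)
    return np.abs(W) * act_norm            # (out_features, in_features)


def prune_rowwise(W, X, sparsity=0.5):
    """Zero out the lowest-scoring weights within each output row
    (hypothetical helper; unstructured per-row sparsity)."""
    S = wanda_scores(W, X)
    k = int(W.shape[1] * sparsity)              # weights dropped per row
    drop = np.argsort(S, axis=1)[:, :k]         # lowest-score indices per row
    W_pruned = W.copy()
    np.put_along_axis(W_pruned, drop, 0.0, axis=1)
    return W_pruned
```

Note that the criterion needs no gradients or retraining, only a small calibration batch to estimate activation norms, which is part of why such one-shot methods are cheap to apply.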

Tamer Ghattas, Michael Hassid, Roy Schwartz • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Language Modeling | WikiText-2 | Perplexity (PPL) | 13.18 | 1624 |
| Question Answering | ARC-E | Accuracy | 63.85 | 416 |
| Physical Interaction Question Answering | PIQA | Accuracy | 72.91 | 333 |
| Language Modeling | LAMBADA | Accuracy | 59.23 | 268 |
| Question Answering | ARC-C | Accuracy | 29.69 | 192 |
