Data-driven Clustering and Merging of Adapters for On-device Large Language Models

About

On-device large language models commonly employ task-specific adapters (e.g., LoRAs) to deliver strong performance on downstream tasks. Storing every available adapter is impractical under memory constraints, but mobile devices typically have enough capacity to store a limited number of them. This raises a critical challenge: how to select representative adapters that generalize well across multiple tasks, a problem that remains unexplored in the existing literature. We propose D2C, a novel adapter clustering method that leverages a small number of task-specific examples (e.g., 10 per task) and employs an iterative optimization process to refine cluster assignments. The adapters within each cluster are then merged, creating multi-task adapters deployable on resource-constrained devices. Experimental results demonstrate that our method effectively boosts performance across the considered storage budgets.
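
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea (alternate between merging the adapters in each cluster and reassigning tasks according to a loss computed from a few examples). It is not the authors' D2C implementation. Everything named here is an assumption made for illustration: adapters are represented as flat lists of floats, merging is uniform parameter averaging, the initial assignment is round-robin, and `loss_fn` is a hypothetical callable standing in for evaluating a merged adapter on a task's handful of examples.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Adapter = List[float]  # assumption: an adapter flattened into one weight vector


def merge(adapters: Sequence[Adapter]) -> Adapter:
    """Merge adapters by uniform parameter averaging (an assumed merge rule)."""
    n = len(adapters)
    return [sum(values) / n for values in zip(*adapters)]


def merge_clusters(adapters: Dict[str, Adapter],
                   assign: Dict[str, int]) -> Dict[int, Adapter]:
    """Build one merged adapter per non-empty cluster."""
    merged = {}
    for cluster in sorted(set(assign.values())):
        members = [adapters[t] for t in adapters if assign[t] == cluster]
        merged[cluster] = merge(members)
    return merged


def cluster_and_merge(adapters: Dict[str, Adapter],
                      loss_fn: Callable[[Adapter, str], float],
                      k: int,
                      max_iters: int = 20) -> Tuple[Dict[str, int], Dict[int, Adapter]]:
    """Alternate between merging clusters and reassigning tasks.

    loss_fn(merged_adapter, task) is hypothetical: it should return the loss
    of the merged adapter on the few examples available for that task.
    """
    tasks = sorted(adapters)
    # Assumed initialisation: round-robin over the sorted task names.
    assign = {task: i % k for i, task in enumerate(tasks)}
    for _ in range(max_iters):
        merged = merge_clusters(adapters, assign)
        # Move each task to the cluster whose merged adapter fits it best.
        # Clusters that became empty are dropped and cannot be chosen again.
        new_assign = {
            task: min(merged, key=lambda c: loss_fn(merged[c], task))
            for task in tasks
        }
        if new_assign == assign:  # assignments are stable, stop early
            break
        assign = new_assign
    return assign, merge_clusters(adapters, assign)


# Toy data: four made-up tasks whose adapters form two obvious groups.
toy_adapters = {
    "a": [0.0, 0.0],
    "b": [0.1, 0.0],
    "c": [5.0, 5.0],
    "d": [5.1, 5.0],
}


def toy_loss(merged_adapter: Adapter, task: str) -> float:
    """Stand-in for a few-shot evaluation: squared distance in weight space."""
    own = toy_adapters[task]
    return sum((m - o) ** 2 for m, o in zip(merged_adapter, own))


assign, merged = cluster_and_merge(toy_adapters, toy_loss, k=2)
print(assign)
print(merged)
```

In this sketch the number of clusters `k` plays the role of the storage budget: only the `k` merged adapters would need to be kept on the device. A real system would replace `toy_loss` with an actual evaluation of the model plus merged adapter on each task's examples.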

Ondrej Bohdal, Taha Ceritli, Mete Ozay, Jijoong Moon, Kyeng-Hun Lee, Hyeonmok Ko, Umberto Michieli • 2026

Related benchmarks

Task	Dataset	Result	Rank
Text Generation	Aggregate NLP Tasks (GEC, Smart Reply, Summarization, Tone Adjustment, QA) (test)	Average Score: 26	18
