INTERLACE: Interleaved Layer Pruning and Efficient Adaptation in Large Vision-Language Models
About
We introduce INTERLACE, a novel framework that prunes redundant layers in VLMs while maintaining performance through sample-efficient finetuning. Existing layer-pruning methods cause significant performance drops when applied to VLMs. Instead, we analyze triplets of consecutive layers to identify local redundancy: we remove the more redundant of the first two layers, finetune the remaining layer to compensate for the lost capacity, and freeze the third layer to serve as a stable anchor during finetuning. We find that this interleaved finetune-freeze design enables rapid convergence with minimal data after pruning. By finetuning only a subset of layers on just 1% of the FineVision dataset for one epoch, INTERLACE retains 88.9% of average performance after dropping 25% of the network, achieving state-of-the-art results. Our code is available at: https://github.com/pmadinei/Interlace.git
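The triplet partitioning described above can be sketched in plain Python. This is a minimal illustration, not the released implementation: it assumes a per-layer redundancy score has already been computed (e.g., from similarity between a layer's input and output hidden states), and the function name `interlace_plan` and the tie-breaking rule are our own. Applying the plan to only a subset of triplets would yield drop ratios below one-in-three, such as the 25% reported above.

```python
def interlace_plan(num_layers, redundancy):
    """Partition layers into consecutive triplets. In each triplet,
    drop the more redundant of the first two layers, mark the survivor
    for finetuning, and freeze the third layer as a stable anchor.

    redundancy: list of floats, one score per layer (higher = more
    redundant). The exact scoring criterion is an assumption here.
    """
    drop, finetune, freeze = [], [], []
    full = num_layers - num_layers % 3  # layers covered by complete triplets
    for i in range(0, full, 3):
        a, b, c = i, i + 1, i + 2
        # Remove the more redundant of the first two layers
        # (ties broken toward the earlier layer).
        victim = a if redundancy[a] >= redundancy[b] else b
        survivor = b if victim == a else a
        drop.append(victim)
        finetune.append(survivor)
        freeze.append(c)  # anchor layer stays frozen during finetuning
    # Layers in a trailing incomplete triplet are simply kept frozen.
    freeze.extend(range(full, num_layers))
    return drop, finetune, freeze


# Toy example: 12 layers with made-up redundancy scores.
scores = [0.9, 0.2, 0.5, 0.1, 0.8, 0.4, 0.7, 0.7, 0.3, 0.6, 0.1, 0.2]
drop, ft, frozen = interlace_plan(12, scores)
# drop    -> [0, 4, 6, 9]   (one layer removed per triplet)
# ft      -> [1, 3, 7, 10]  (finetuned to absorb the lost capacity)
# frozen  -> [2, 5, 8, 11]  (anchors)
```

In a real pipeline, the `ft` indices would have `requires_grad=True` while the `frozen` indices are detached from the optimizer, so only a subset of layers is updated during the brief post-pruning finetune.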
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Science Question Answering | ScienceQA | -- | -- | 502 |
| Chart Question Answering | ChartQA | -- | -- | 356 |
| Real-world Question Answering | RealworldQA | Overall Score | 61.7 | 58 |
| General Vision-Language Understanding | LLaVA-OneVision | Score | 63.14 | 36 |
| Mathematical Reasoning | Snapask | Accuracy | 28.82 | 36 |
| Real-world Visual Understanding | RealworldQA | Score | 64.97 | 29 |
| Mathematical Reasoning | NuminaMath | Math Accuracy | 47.99 | 18 |
| Visual Search and Reasoning | VSTAR | Score | 68.59 | 18 |
| Fine-grained Visual Perception | VSTAR | VStar Score | 73.82 | 18 |