Topology-Aware Layer Pruning for Large Vision-Language Models

About

Large Language Models (LLMs) have demonstrated strong capabilities in natural language understanding and reasoning, and recent extensions that incorporate visual inputs enable them to process multimodal information. Despite these advances, Large Vision-Language Models (LVLMs) incur substantial computational and memory costs, which hinders deployment in resource-constrained scenarios. Existing layer pruning methods typically rely on local similarity metrics or static proxy signals. These fail to capture the global and dynamic evolution of representations across model depth, which often leads to the removal of transition-critical layers. To address this limitation, we propose a topology-aware layer pruning framework for LVLMs. Specifically, we represent layer-wise hidden states as point clouds and model their evolution using simplicial complexes. By leveraging zigzag persistent homology, we quantify inter-layer topological consistency and enable adaptive pruning that preserves critical representational transitions. Extensive experiments on diverse multimodal benchmarks demonstrate that the proposed framework consistently outperforms existing pruning methods across a wide range of sparsity ratios. Our code is available at https://github.com/zpc456/TopoVLM.
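
To make the idea of treating hidden states as point clouds more concrete, the following is a minimal sketch, not the authors' implementation. It swaps zigzag persistent homology for a much cruder proxy: it builds the 1-skeleton of a Vietoris-Rips complex at a few radii, counts connected components (Betti-0), and compares the resulting curves between adjacent layers. All function names, the choice of radii, the similarity score, and the convention that hidden_states[k] is the output of layer k (with index 0 being the embedding output) are assumptions made for illustration only.

```python
import math

def rips_edges(points, radius):
    """Edges of the Vietoris-Rips 1-skeleton: pairs at distance <= radius."""
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                edges.append((i, j))
    return edges

def betti0(num_points, edges):
    """Count connected components with a union-find structure."""
    parent = list(range(num_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_points
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    return components

def betti0_curve(points, radii):
    """Betti-0 of the Rips complex at each radius in `radii`."""
    return [betti0(len(points), rips_edges(points, r)) for r in radii]

def layer_consistency(layer_a, layer_b, radii):
    """Hypothetical score in (0, 1]: 1 minus the normalized L1 gap between
    the Betti-0 curves of two point clouds with the same number of points."""
    curve_a = betti0_curve(layer_a, radii)
    curve_b = betti0_curve(layer_b, radii)
    gap = sum(abs(a - b) for a, b in zip(curve_a, curve_b))
    return 1.0 - gap / (len(radii) * len(layer_a))

def rank_prunable_layers(hidden_states, radii):
    """Rank layers 1..L-1, most redundant first. Layer k+1 is scored by how
    little it changes the topology between hidden_states[k] and [k+1]."""
    scores = [
        (layer_consistency(hidden_states[k], hidden_states[k + 1], radii), k + 1)
        for k in range(len(hidden_states) - 1)
    ]
    return [layer for _, layer in sorted(scores, reverse=True)]

# Toy data (made up): three "layers", each a cloud of four 2-D points.
toy_states = [
    [(0, 0), (1, 0), (10, 0), (11, 0)],
    [(0, 0), (1, 0), (10, 0), (11, 0)],   # identical to the previous layer
    [(0, 0), (5, 0), (10, 0), (15, 0)],   # cluster structure changes here
]
toy_radii = [1.5, 6.0]
ranking = rank_prunable_layers(toy_states, toy_radii)
```

In this toy setup the layer whose output leaves the cluster structure unchanged is ranked as the first pruning candidate, while the layer that reorganizes the clusters is ranked last. A faithful reproduction would need higher-dimensional homology and a zigzag filtration across layers, which this sketch deliberately omits.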

Pengcheng Zheng, Chaoning Zhang, Ya Wen, Wang Liu, Qigan Sun, Jiarong Mo, Jiaquan Zhang, Jewon Lee, Tae-Ho Kim, Kuien Liu, Tianyu Li, Caiyan Qin, Yang Yang • 2026

Related benchmarks

Task	Dataset	Result
Diagram Question Answering	AI2D	AI2D Accuracy 65.6	509
Multi-discipline Multimodal Understanding	MMMU	--	422
Chart Question Answering	ChartQA	--	404
Mathematical Reasoning	MathVista	Accuracy 56.3	382
Visual Perception	BLINK	--	255
Video Understanding	EgoSchema	EgoSchema Score 55.4	185
Multi-modal Video Understanding	MVBench	Score 77.8	84
Multimodal Benchmarking	MMBench	MMBench Score 68.1	60
Video Question Answering	NextQA MC	Score 72.5	44
Multi-modal Evaluation	MME	MME Perception Score 1.51e+3	43

Showing 10 of 25 rows
