Rotation-Aligned Key Channel Pruning for Efficient Vision-Language Model Inference
About
Vision-Language Models suffer severe KV cache pressure at inference, as a single image often encodes into thousands of tokens. Most existing methods exploit token sparsity through token pruning, but permanently discarding visual content causes substantial degradation on fine-grained perception tasks. This motivates a complementary axis, feature sparsity: under a fixed KV cache budget, compressing the channel dimension preserves more visual tokens at the same memory cost. Prior Key channel pruning methods, however, face a structural trade-off: token-wise channel pruning is expressive but unstructured and slow, while head-wise channel pruning is hardware-friendly but less robust. We resolve this with RotateK, a rotation-based structured Key channel pruning framework. RotateK applies an online PCA-based rotation that aligns token-dependent channel importance into a shared low-dimensional subspace, enabling accurate pruning under lightweight head-wise masks; a fused Triton attention kernel operates directly on sparse-channel Keys for efficient decoding. Experiments on two representative VLM backbones show that RotateK consistently outperforms prior Key channel pruning methods in both accuracy and decoding latency, while joint token-channel pruning improves over token-only baselines at matched KV cache budgets.
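The rotation idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it uses a batch (offline) PCA over one head's Keys rather than the online variant, plain matrix products rather than the fused Triton kernel, and synthetic data. The function names `pca_rotation` and `pruned_scores` and all shapes are hypothetical. Because the rotation is orthogonal, applying it to both Queries and Keys leaves the attention scores unchanged; after rotation, one shared mask that keeps the leading channels approximates the scores far better than keeping the same number of unrotated channels.

```python
import numpy as np

def pca_rotation(keys):
    """Return an orthogonal (d, d) matrix whose columns are the principal
    directions of `keys` (shape (T, d)), sorted by decreasing variance.
    Assumption: batch PCA on the uncentered second-moment matrix."""
    second_moment = keys.T @ keys / keys.shape[0]
    eigvals, eigvecs = np.linalg.eigh(second_moment)  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order]

def pruned_scores(queries, keys, keep):
    """Rotate Queries and Keys, keep the first `keep` rotated channels for
    every token (one shared, head-wise mask), and return Q K^T scores."""
    rotation = pca_rotation(keys)
    q_rot = queries @ rotation
    k_rot = keys @ rotation
    return q_rot[:, :keep] @ k_rot[:, :keep].T

# Synthetic single-head example: Keys lie near an 8-dimensional subspace.
rng = np.random.default_rng(0)
num_tokens, head_dim, latent_dim, keep = 256, 64, 8, 16
basis = rng.standard_normal((latent_dim, head_dim))
keys = rng.standard_normal((num_tokens, latent_dim)) @ basis
keys += 0.01 * rng.standard_normal((num_tokens, head_dim))
queries = rng.standard_normal((4, head_dim))

exact = queries @ keys.T
full = pruned_scores(queries, keys, head_dim)      # no channels dropped
rotated = pruned_scores(queries, keys, keep)       # prune after rotation
naive = queries[:, :keep] @ keys[:, :keep].T       # prune without rotation

def rel_err(approx):
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)

rotated_err, naive_err = rel_err(rotated), rel_err(naive)
print(f"rotated: {rotated_err:.4f}  naive: {naive_err:.4f}")
```

In this toy setting the pruned Keys store 16 of 64 channels per token, so a fixed Key cache budget would hold four times as many tokens; how closely this carries over to real VLM Keys depends on how low-rank they are in practice.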
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Visual Question Answering | VizWiz | Accuracy 70.32 | 1820 |
| Visual Question Answering | ChartQA | Accuracy 79.56 | 519 |
| Visual Question Answering | TextVQA | TextVQA Accuracy 82.38 | 210 |
| Visual Question Answering | DocVQA | Accuracy 91.29 | 205 |
| Visual Question Answering | InfoVQA | Accuracy 70.51 | 195 |
| Visual Question Answering | Visual Question Answering Evaluation Suite TVQA, InfoVQA, ChartQA, DocVQA, VizWiz | TVQA Accuracy 82.38 | 26 |
| Open-ended generation | LLaVA-Bench In-the-Wild | Score 106.9 | 14 |
| Open-ended generation | MM-Vet | MM-Vet Score 44.04 | 14 |