Vision Graph Prompting via Semantic Low-Rank Decomposition

About

Vision GNN (ViG) demonstrates superior performance by representing images as graph structures, providing a more natural way to capture irregular semantic patterns beyond traditional grid or sequence-based representations. To efficiently adapt ViG to downstream tasks, parameter-efficient fine-tuning techniques like visual prompting become increasingly essential. However, existing prompting methods are primarily designed for Transformer-based models, neglecting the rich topological relationships among nodes and edges in graph-based representations, limiting their capacity to model complex semantics. In this paper, we propose Vision Graph Prompting (VGP), a novel framework tailored for vision graph structures. Our core insight reveals that semantically connected components in the graph exhibit low-rank properties. Building on this observation, we introduce a semantic low-rank prompting method that decomposes low-rank semantic features and integrates them with prompts on vision graph topologies, capturing both global structural patterns and fine-grained semantic dependencies. Extensive experiments demonstrate our method significantly improves ViG's transfer performance on diverse downstream tasks, achieving results comparable to full fine-tuning while maintaining parameter efficiency. Our code is available at https://github.com/zhoujiahuan1991/ICML2025-VGP.
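The low-rank observation in the abstract can be illustrated with a small synthetic sketch. This is not the authors' implementation (see their repository for that). It assumes, for illustration only, that nodes in the same semantic group share a prototype feature plus small noise, and it uses a plain truncated SVD as a stand-in for the decomposition step. The names `low_rank_approx` and `energy_ratio`, the group sizes, and the noise level are all hypothetical.

```python
import numpy as np


def low_rank_approx(features: np.ndarray, rank: int) -> np.ndarray:
    """Best rank-`rank` approximation (Frobenius norm) via truncated SVD."""
    u, s, vt = np.linalg.svd(features, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def energy_ratio(features: np.ndarray, rank: int) -> float:
    """Fraction of squared singular-value energy in the top `rank` components."""
    s = np.linalg.svd(features, compute_uv=False)
    return float((s[:rank] ** 2).sum() / (s ** 2).sum())


# Synthetic node features (assumption): 3 semantic groups, 20 nodes each,
# every node equals its group's prototype plus small Gaussian noise.
rng = np.random.default_rng(0)
num_groups, nodes_per_group, dim = 3, 20, 64
prototypes = rng.normal(size=(num_groups, dim))
labels = np.repeat(np.arange(num_groups), nodes_per_group)
features = prototypes[labels] + 0.01 * rng.normal(size=(labels.size, dim))

# A rank equal to the number of groups should capture almost everything.
approx = low_rank_approx(features, rank=num_groups)
rel_err = np.linalg.norm(features - approx) / np.linalg.norm(features)
top_energy = energy_ratio(features, rank=num_groups)
```

Under these assumptions, a 60 x 64 feature matrix is recovered almost exactly from three components, which is the sense in which semantically connected nodes are "low-rank". Real ViG features would need to be checked empirically rather than assumed to behave this way.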

Zixiang Ai, Zichen Liu, Jiahuan Zhou • 2025

Related benchmarks

Task	Dataset	Result
Graph property prediction	BACE	ROC-AUC 83.62	111
Graph property prediction	Tox21	ROC-AUC 0.8008	109
Graph property prediction	ClinTox	ROC-AUC 74.82	102
Graph property prediction	ToxCast	ROC-AUC 0.6825	95
Graph property prediction	MUV	ROC-AUC 0.8264	95
Graph property prediction	SIDER	ROC-AUC 67.12	95
Graph Classification	BBBP	ROC-AUC 70.17	18
