i-WiViG: Interpretable Window Vision GNN

About

Vision graph neural networks have emerged as a popular approach for modeling the global and spatial context for image recognition. However, a significant drawback of these methods is that they do not offer an inherent interpretation of the relevant spatial interactions for their prediction. We address this problem by introducing i-WiViG, an approach that enables interpretable model reasoning based on a sparse subgraph in the image. i-WiViG is based on two key postulates: 1) constraining the graph nodes' receptive field to disjoint local windows in the image, and 2) an inherently interpretable graph bottleneck with learnable sparse attention that identifies the relevant interactions among the local image windows. We evaluate our approach on both scene classification and regression tasks using natural and remote sensing imagery. Our results, supported by quantitative and qualitative evidence, demonstrate that the method delivers semantic, intuitive, and faithful explanations through the identified subgraphs. Furthermore, extensive experiments confirm that it achieves performance competitive with its black-box counterparts, even on datasets exhibiting strong texture bias. The implementation is available at https://github.com/zhu-xlab/i-WiViG.
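The two postulates can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that): the function names, the mean-intensity node feature, and the similarity-based edge score are all hypothetical, and a fixed top-k selection stands in for the learnable sparse attention described in the abstract.

```python
from itertools import combinations


def window_partition(height, width, win):
    """Top-left corners of the disjoint win x win windows covering the grid."""
    return [(r, c)
            for r in range(0, height - win + 1, win)
            for c in range(0, width - win + 1, win)]


def window_features(image, windows, win):
    """Hypothetical node feature: mean intensity inside each window."""
    feats = []
    for r0, c0 in windows:
        total = sum(image[r][c]
                    for r in range(r0, r0 + win)
                    for c in range(c0, c0 + win))
        feats.append(total / (win * win))
    return feats


def pairwise_scores(feats):
    """Hypothetical edge score: negative feature distance between two windows."""
    return {(i, j): -abs(feats[i] - feats[j])
            for i, j in combinations(range(len(feats)), 2)}


def sparse_edge_mask(scores, keep):
    """Keep only the `keep` highest-scoring edges (stand-in for learned sparse attention)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


# Toy 4x4 "image" split into four disjoint 2x2 windows (one graph node each).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [5, 5, 8, 8],
    [5, 5, 8, 8],
]
windows = window_partition(4, 4, 2)
feats = window_features(image, windows, 2)
subgraph = sparse_edge_mask(pairwise_scores(feats), keep=2)
```

Here `subgraph` holds the retained window pairs, which plays the role of the sparse explanatory subgraph: each node sees only its own window, and only a few window-to-window interactions survive the bottleneck.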

Ivica Obadic, Dmitry Kangin, Adrian Höhl, Dario Oliveira, Plamen P Angelov, Xiao Xiang Zhu • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | SUN397 (test) | Top-1 Accuracy: 58 | 231 |
| Scene recognition | SUN 397 (test) | Top-1 Accuracy: 58 | 35 |
| Scene Classification | RESISC-45 (test) | OA: 92 | 32 |
| Regression | Liveability (test) | 0.47 | 9 |
