
IG2: Integrated Gradient on Iterative Gradient Path for Feature Attribution

About

Feature attribution explains Artificial Intelligence (AI) at the instance level by assigning each input feature an importance score for its contribution to the model's prediction. Integrated Gradients (IG) is a prominent path attribution method for deep neural networks that integrates gradients along a path from the explained input (explicand) to a counterfactual instance (baseline). Current IG variants primarily focus on the gradient of the explicand's output. However, our research indicates that the gradient of the counterfactual output significantly affects feature attribution as well. To account for this, we propose Iterative Gradient path Integrated Gradients (IG2), which considers both gradients. IG2 incorporates the counterfactual gradient iteratively into the integration path, generating a novel path (GradPath) and a novel baseline (GradCF). These two novel IG components effectively address the issues of attribution noise and arbitrary baseline choice in earlier IG methods. IG2, as a path method, satisfies many desirable axioms, which are theoretically justified in the paper. Experimental results on the XAI benchmark, ImageNet, MNIST, TREC question answering, wafer-map failure patterns, and CelebA face attributes validate that IG2 delivers superior feature attributions compared to state-of-the-art techniques. The code is released at: https://github.com/JoeZhuo-ZY/IG2.
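
As an illustration of the path integration described above, here is a minimal sketch of standard Integrated Gradients on a straight-line path from baseline to explicand. It is not the IG2 method: GradPath and GradCF are not reproduced here, and the authors' implementation is in the repository linked above. The toy model `f`, its analytic gradient `grad_f`, and the helper `integrated_gradients` are hypothetical names chosen for this example only.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Midpoint Riemann-sum approximation of standard IG on a straight-line path.

    Assumption: grad_fn(z) returns the gradient of the model output at point z.
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Gradient evaluated at each interpolated point between baseline and x.
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical toy model (not from the paper): f(x) = x0^2 + 3*x0*x1.
def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_f(x):
    return np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]])

explicand = np.array([1.0, 2.0])
baseline = np.zeros(2)

attributions = integrated_gradients(grad_f, explicand, baseline)
# Completeness axiom: attributions should sum to f(explicand) - f(baseline).
completeness_gap = abs(attributions.sum() - (f(explicand) - f(baseline)))
```

For this toy model the attributions sum to f(explicand) - f(baseline), which is the completeness axiom that path methods satisfy. The straight-line path and all-zero baseline used here are the conventional choices that, per the abstract, IG2 replaces with GradPath and GradCF.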

Yue Zhuo, Zhiqiang Ge • 2024

Related benchmarks

Task                      Dataset            Result                Rank
Attribution Faithfulness  Oxford-IIIT Pet    Insertion AUC 0.6832  34
Attribution               ImageNet 2012      DiffID Score 30.47    30
Attribution               Oxford 102 Flower  DiffID 21.89          30
Attribution Localization  Manometry          MSE 4.76              22
Attribution Localization  Chest X-ray        MSE 1.55              22
Attribution Localization  Brain MRI          MSE 3.17              22
