
Not Just a Black Box: Learning Important Features Through Propagating Activation Differences

About

Note: This paper describes an older version of DeepLIFT. See https://arxiv.org/abs/1704.02685 for the newer version.

Original abstract: The purported "black box" nature of neural networks is a barrier to adoption in applications where interpretability is essential. Here we present DeepLIFT (Learning Important FeaTures), an efficient and effective method for computing importance scores in a neural network. DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. We apply DeepLIFT to models trained on natural images and genomic data, and show significant advantages over gradient-based methods.
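To make the idea concrete, here is a minimal toy sketch of the difference-from-reference principle for a single ReLU neuron. This is a hypothetical illustration, not the authors' implementation: the function names, the all-zeros reference, and the one-layer network are all assumptions chosen for brevity. It distributes the activation difference (relative to the reference) across inputs, so the scores satisfy the summation-to-delta property.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def deeplift_scores(x, x_ref, w, b):
    """Toy DeepLIFT-style scores for a one-neuron network y = relu(w.x + b).

    Hypothetical sketch: compare the activation to its reference activation
    and assign each input a share of the difference.
    """
    z, z_ref = w @ x + b, w @ x_ref + b   # pre-activations for input and reference
    y, y_ref = relu(z), relu(z_ref)       # activations for input and reference
    dz = z - z_ref
    # Linear part: each input's share of the pre-activation difference.
    linear_contrib = w * (x - x_ref)
    # Pass the difference through the nonlinearity via the ratio of the
    # activation difference to the pre-activation difference.
    multiplier = (y - y_ref) / dz if dz != 0 else 0.0
    return linear_contrib * multiplier, y - y_ref

x = np.array([1.0, 2.0, -1.0])
x_ref = np.zeros(3)                       # assumed all-zeros reference input
w = np.array([0.5, -0.25, 1.0])
b = 0.1
scores, dy = deeplift_scores(x, x_ref, w, b)
# Scores sum to the activation difference vs. the reference.
assert np.isclose(scores.sum(), dy)
```

Unlike a plain gradient, which is zero wherever the ReLU is inactive, the difference-from-reference scores above remain informative whenever the activation differs from its reference.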

Avanti Shrikumar, Peyton Greenside, Anna Shcherbina, Anshul Kundaje • 2016

Related benchmarks

Task | Dataset | Metric | Result | Rank
---- | ------- | ------ | ------ | ----
Explainability | ImageNet (val) | Insertion | 36.3 | 104
Attribution Fidelity | ImageNet 1,000 images (val) | µFidelity | 0.157 | 48
Deletion | ImageNet 2,000 images (val) | Deletion Score | 0.14 | 48
Feature Attribution | Rotten Tomatoes fine-tuned | LO | -0.121 | 18
Feature Attribution | IMDB (test) | LO | -0.0892 | 18
Feature Attribution | SST2 | LO | -0.199 | 18
Explanation Faithfulness | IMDB Review 1,000 sentences (val) | Word Deletion Score | 68.2 | 14
Feature Attribution | Synthetic half-moons dataset with Gaussian noise (std dev 0.05-0.65) | AUC-Purity | 0.328 | 10
Feature Attribution | Pascal VOC (test) | AUC-Comp | 0.21 | 8
