Learning Important Features Through Propagating Activation Differences

About

The purported "black box" nature of neural networks is a barrier to adoption in applications where interpretability is essential. Here we present DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input. DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. By optionally giving separate consideration to positive and negative contributions, DeepLIFT can also reveal dependencies which are missed by other approaches. Scores can be computed efficiently in a single backward pass. We apply DeepLIFT to models trained on MNIST and simulated genomic data, and show significant advantages over gradient-based methods. Video tutorial: http://goo.gl/qKb7pL, ICML slides: bit.ly/deeplifticmlslides, ICML talk: https://vimeo.com/238275076, code: http://goo.gl/RM8jvH.
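The core idea in the abstract — compare each neuron's activation to a reference activation and propagate contribution scores for the resulting difference — can be sketched in NumPy. This is a minimal, hypothetical illustration of the Rescale-style rule on a tiny two-layer network (the weights, the all-zeros reference, and the helper names are my own assumptions, not from the paper); it also checks the summation-to-delta property, i.e. that input contributions sum to the change in output.

```python
import numpy as np

# Hypothetical tiny network: linear -> ReLU -> linear readout.
# Weights are illustrative only, not from the paper.
W1 = np.array([[1.0, -2.0], [0.5, 1.5]])
b1 = np.array([0.1, -0.2])
w2 = np.array([2.0, -1.0])

def forward(x):
    h = W1 @ x + b1
    a = np.maximum(h, 0.0)  # ReLU activations
    return w2 @ a, h, a

def deeplift_contribs(x, x_ref):
    """Rescale-style DeepLIFT sketch: a neuron's multiplier is
    (delta output) / (delta input); contributions are multiplier * delta."""
    y, h, a = forward(x)
    y_ref, h_ref, a_ref = forward(x_ref)
    dh = h - h_ref
    # Multiplier through ReLU: ratio of activation difference to
    # pre-activation difference, falling back to the gradient when
    # the difference is ~0 (assumed tolerance, my choice).
    safe_dh = np.where(np.abs(dh) > 1e-8, dh, 1.0)
    m_relu = np.where(np.abs(dh) > 1e-8,
                      (a - a_ref) / safe_dh,
                      (h > 0).astype(float))
    # Backpropagate multipliers in one pass: output -> ReLU -> input.
    m_input = W1.T @ (w2 * m_relu)
    contribs = m_input * (x - x_ref)
    return contribs, y - y_ref

x = np.array([1.0, 2.0])
x_ref = np.zeros(2)  # reference input (all zeros here, by assumption)
contribs, delta_y = deeplift_contribs(x, x_ref)
# Summation-to-delta: input contributions sum to the output difference.
assert np.isclose(contribs.sum(), delta_y)
```

Note how the multiplier replaces the local gradient of the chain rule with a finite-difference ratio against the reference, which is why saturated neurons (zero gradient, nonzero difference) still receive credit.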

Avanti Shrikumar, Peyton Greenside, Anshul Kundaje • 2017

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Localization | ImageNet-1k (val) | -- | 79 |
| Faithfulness Evaluation | TellMeWhy | AUC π-Soft-NS: 0.313 | 67 |
| Faithfulness Evaluation | WikiBio | AUC π-Soft-NS: 0.348 | 67 |
| Feature Attribution Plausibility | MDACE (test) | P: 32.4 | 65 |
| Feature Relevance Evaluation | ImageNet (test) | R (Feature Relevance): 0.48 | 60 |
| Faithfulness Evaluation | MDACE (test) | Comp Score: 77 | 40 |
| Feature Interaction Attribution | Dyck-2 15,000 corpus size (test) | Average Relative Ranks (ARR): 0.542 | 34 |
| Explanation Plausibility | MDACE bigger (test) | Precision: 33.9 | 32 |
| XAI Evaluation | ImageNet-S (val) | Selection Score: 4.447 | 28 |
| Faithfulness Evaluation | BoolQ | AUC π-Soft-NS: 32.6 | 27 |

Showing 10 of 91 rows.
