
Inverse Classification for Comparison-based Interpretability in Machine Learning

About

In the context of post-hoc interpretability, this paper addresses the task of explaining the prediction of a classifier when no information is available about either the classifier itself or the processed data (neither the training nor the test data). It proposes an instance-based approach built on determining the minimal changes needed to alter a prediction: given a data point whose classification must be explained, the method identifies a close neighbour that is classified differently, where the definition of closeness integrates a sparsity constraint. This principle is implemented using observation generation in the Growing Spheres algorithm. Experimental results on two datasets illustrate the relevance of the proposed approach, which can be used to gain knowledge about the classifier.
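
The principle described above (generate observations around the point, keep the closest one that is classified differently, then make the change sparse) can be illustrated with a minimal sketch. This is a simplified reading of the abstract, not the authors' implementation: the function names `sample_layer` and `find_counterfactual`, the parameters `step`, `n` and `max_layers`, and the fixed-width layer schedule are all assumptions made for illustration. The only access to the classifier is a black-box `predict` function that maps an array of points to labels.

```python
import numpy as np


def sample_layer(center, r_in, r_out, n, rng):
    """Draw n points uniformly from the layer r_in <= ||z - center|| <= r_out."""
    d = center.shape[0]
    # Random directions: normalised Gaussian vectors are uniform on the sphere.
    directions = rng.normal(size=(n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Sample r**d uniformly so that points are uniform in volume, not in radius.
    radii = rng.uniform(r_in ** d, r_out ** d, size=n) ** (1.0 / d)
    return center + directions * radii[:, None]


def find_counterfactual(predict, x, step=0.1, n=500, max_layers=200, seed=0):
    """Return a nearby point classified differently from x, or None.

    Hypothetical helper: `predict` is any black-box function mapping an
    (m, d) array to m labels. No training or test data is used.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    label = predict(x[None, :])[0]

    # Generation step: grow spherical layers around x until one contains
    # a point with a different predicted label.
    enemy, r_in = None, 0.0
    for _ in range(max_layers):
        r_out = r_in + step
        candidates = sample_layer(x, r_in, r_out, n, rng)
        flipped = candidates[predict(candidates) != label]
        if len(flipped) > 0:
            distances = np.linalg.norm(flipped - x, axis=1)
            enemy = flipped[np.argmin(distances)].copy()
            break
        r_in = r_out
    if enemy is None:
        return None

    # Sparsity step: starting with the smallest changes, reset each feature
    # to its original value whenever the prediction stays different.
    for i in np.argsort(np.abs(enemy - x)):
        trial = enemy.copy()
        trial[i] = x[i]
        if predict(trial[None, :])[0] != label:
            enemy = trial
    return enemy
```

For a toy classifier such as `lambda X: (X[:, 0] > 1.0).astype(int)` and the point `[0.0, 0.0]`, the sketch returns a point that differs from the input only in its first feature, which reads as "the prediction changes once feature 0 exceeds roughly 1". The layer width `step` trades precision against the number of calls to `predict`.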

Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, Marcin Detyniecki • 2017

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Counterfactual Explanations | COMPAS | Validity 64.4 | 21 |
| Counterfactual Explanations | Cancer | Validity 48.6 | 15 |
| Counterfactual Explanations | Diabetes | Validity 49.8 | 15 |
| Counterfactual Explanations | FICO | Validity 49.2 | 15 |
| Counterfactual Explanations | Housing | Validity 49.4 | 15 |
| Counterfactual Explanations | Bank | Validity 48.8 | 14 |
| Counterfactual Explanations | Titanic | Validity 0.499 | 14 |
| Counterfactual Explanations | Churn | Validity 48.5 | 12 |
| Counterfactual Explanations | Hyperplane (Hyp.) (final-checkpoint) | Validation Score 1 | 12 |
| Counterfactual Explanations | Sine (final-checkpoint) | Validation Score 100 | 12 |
Showing 10 of 11 rows
