Very fast, approximate counterfactual explanations for decision forests

About

We consider the problem of finding a counterfactual explanation for a classification or regression forest, such as a random forest. This requires solving an optimization problem: find the input instance closest to a given instance for which the forest outputs a desired value. Finding an exact solution has a cost that is exponential in the number of leaves in the forest. We propose a simple but very effective approach: we constrain the optimization to only those input space regions defined by the forest that are populated by actual data points. The problem then reduces to a form of nearest-neighbor search using a certain distance on a certain dataset. This has two advantages. First, the solution can be found very quickly, which scales to large forests and high-dimensional data and enables interactive use. Second, the solution found is more likely to be realistic, in that it is guided towards high-density areas of input space.
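
The idea of restricting the search to forest regions that contain data can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the distance or how regions are represented, so the code below assumes axis-aligned trees, majority voting, Euclidean distance, and a toy tuple encoding of trees. All names (leaf_box, forest_region, project, counterfactual) are hypothetical.

```python
import math
from collections import Counter

# Assumed toy encoding: a tree is ("leaf", label) or
# ("node", feature, threshold, left, right); x goes left if x[feature] <= threshold.

def leaf_box(tree, x):
    """Return (lo, hi, label) for the leaf of `tree` that contains x."""
    lo = [-math.inf] * len(x)
    hi = [math.inf] * len(x)
    while tree[0] == "node":
        _, f, t, left, right = tree
        if x[f] <= t:
            hi[f] = min(hi[f], t)
            tree = left
        else:
            lo[f] = max(lo[f], t)
            tree = right
    return lo, hi, tree[1]

def forest_region(forest, x):
    """Intersect the leaf boxes of all trees at x; label by majority vote."""
    lo = [-math.inf] * len(x)
    hi = [math.inf] * len(x)
    votes = Counter()
    for tree in forest:
        tlo, thi, label = leaf_box(tree, x)
        lo = [max(a, b) for a, b in zip(lo, tlo)]
        hi = [min(a, b) for a, b in zip(hi, thi)]
        votes[label] += 1
    return lo, hi, votes.most_common(1)[0][0]

def project(x, lo, hi, eps=1e-6):
    """Closest point to x in the box; lower bounds are strict, hence eps."""
    return [min(max(v, l + eps), h) for v, l, h in zip(x, lo, hi)]

def counterfactual(forest, data, x, target):
    """Search only regions populated by data points that predict `target`."""
    best, best_dist = None, math.inf
    for z in data:
        lo, hi, label = forest_region(forest, z)
        if label != target:
            continue
        cand = project(x, lo, hi)
        d = math.dist(x, cand)  # Euclidean distance (an assumption)
        if d < best_dist:
            best, best_dist = cand, d
    return best, best_dist

# Toy forest of three stumps on 2-D inputs, and a toy dataset.
forest = [
    ("node", 0, 0.5, ("leaf", 0), ("leaf", 1)),
    ("node", 1, 0.5, ("leaf", 0), ("leaf", 1)),
    ("node", 0, 0.7, ("leaf", 0), ("leaf", 1)),
]
data = [[0.2, 0.2], [0.9, 0.9], [0.6, 0.8], [0.8, 0.1]]
x = [0.2, 0.2]
best, best_dist = counterfactual(forest, data, x, target=1)
print(best, best_dist)
```

The loop visits one region per data point, so the cost grows with the dataset size rather than with the number of possible leaf combinations. A practical version would deduplicate regions and could use a nearest-neighbor index instead of a linear scan.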

Miguel Á. Carreira-Perpiñán, Suryabhan Singh Hada • 2023

Related benchmarks

Task	Dataset	Result
Counterfactual Explanations	Breast-Cancer (BC)	T02.9	4
Counterfactual Explanations	PD	T06	4
Counterfactual Explanations	COMPAS CP	T01.4	4
Counterfactual Explanations	FI	T021.4	4
