Counterfactual Explanations Under Concept Drift

About

Counterfactual explanations (CFEs) provide actionable recourse, but most methods assume a static setting with fixed data and a trained classifier. This assumption breaks in evolving data environments, such as data streams, where online models are repeatedly updated under concept drift. We identify CFE maintenance in this setting as a previously overlooked problem: explanations that are valid when generated may silently become invalid as the model evolves. This also affects robust CFEs, which are not designed for continuous drift. We propose a lightweight, model-agnostic update scheme that repairs existing CFEs using local sampling to estimate validity and plausibility directions while preserving proximity to the original instance. Experiments on synthetic drifting streams show that initially created CFEs rapidly lose validity, whereas maintained CFEs preserve validity and local plausibility at a lower cost than repeated regeneration.
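The repair idea summarized above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `repair_cfe`, all of its parameters, the specific update rule, and the toy `drifted_predict` model are assumptions made for illustration. The sketch samples points around a stale CFE, estimates a validity direction from the samples the current model assigns to the target class, optionally adds a plausibility direction toward recent target-class data, and favors valid samples close to the original instance.

```python
import numpy as np


def repair_cfe(x_orig, x_cf, predict, target, recent_target_points=None,
               n_samples=200, radius=0.5, step=0.1, plaus_weight=0.3,
               max_iter=50, seed=0):
    """Nudge a stale counterfactual back to validity (illustrative sketch).

    predict: black-box function mapping an (n, d) array to n class labels.
    All names and the update rule here are assumptions, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    x_orig = np.asarray(x_orig, dtype=float)
    x_cf = np.array(x_cf, dtype=float)  # copy so the caller's CFE is untouched

    for _ in range(max_iter):
        if predict(x_cf[None, :])[0] == target:
            return x_cf, True  # already valid under the current model

        # Local sampling around the current CFE.
        samples = x_cf + rng.normal(scale=radius, size=(n_samples, x_cf.size))
        valid = samples[predict(samples) == target]
        if len(valid) == 0:
            radius *= 2.0  # no valid neighbors found: widen the search
            continue

        # Validity direction, biased toward valid samples near x_orig (proximity).
        dist = np.linalg.norm(valid - x_orig, axis=1)
        nearest = valid[np.argsort(dist)[: max(1, len(valid) // 2)]]
        direction = nearest.mean(axis=0) - x_cf

        # Optional plausibility direction: toward nearby recent target-class data.
        if recent_target_points is not None:
            pts = np.asarray(recent_target_points, dtype=float)
            k = min(5, len(pts))
            idx = np.argsort(np.linalg.norm(pts - x_cf, axis=1))[:k]
            direction = direction + plaus_weight * (pts[idx].mean(axis=0) - x_cf)

        norm = np.linalg.norm(direction)
        if norm > 0:
            x_cf = x_cf + step * direction / norm  # small fixed-length step

    return x_cf, bool(predict(x_cf[None, :])[0] == target)


def drifted_predict(X):
    """Toy model after drift: class 1 iff the first feature exceeds 1.0."""
    return (np.asarray(X)[:, 0] > 1.0).astype(int)


# A CFE at [0.2, 0.0] was valid when the boundary sat at 0; after drift it is not.
x_new, ok = repair_cfe([-1.0, 0.0], [0.2, 0.0], drifted_predict, target=1)
```

Taking small steps from the existing CFE, rather than regenerating it from scratch, is what keeps the repaired point close to the earlier explanation in this sketch. The step size and sampling radius would need tuning for any real stream.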

Marcin Kostrzewa, Jerzy Stefanowski, Maciej Zięba • 2026

Related benchmarks

Task	Dataset	Result
Counterfactual Explanations	Hyperplane (Hyp.) (final-checkpoint)	Validation Score: 1	12
Counterfactual Explanations	Sine (final-checkpoint)	Validation Score: 100	12
Counterfactual Explanations	SEA (final-checkpoint)	Validation Score: 100	12

