MaskDiME: Adaptive Masked Diffusion for Precise and Efficient Visual Counterfactual Explanations

About

Visual counterfactual explanations aim to reveal the minimal semantic modifications that can alter a model's prediction, providing causal and interpretable insights into deep neural networks. However, existing diffusion-based counterfactual generation methods are often computationally expensive, slow to sample, and imprecise in localizing the modified regions. To address these limitations, we propose MaskDiME, a simple, fast, and effective diffusion framework that unifies semantic consistency and spatial precision through localized sampling. Our approach adaptively focuses on decision-relevant regions, so that the generated counterfactuals are localized and semantically consistent while image fidelity stays high. MaskDiME is training-free, runs inference over 30x faster than the baseline method, and achieves comparable or state-of-the-art performance across five benchmark datasets spanning diverse visual domains, establishing a practical and generalizable solution for efficient counterfactual explanation.
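The abstract does not spell out how the localized sampling is carried out, so the following is only a generic sketch of mask-restricted editing, not the MaskDiME algorithm. It assumes a per-pixel relevance map is already available (for example from classifier gradients), keeps the most relevant fraction of pixels as a binary mask, and lets an edited image overwrite the original only inside that mask. The function names, the `fraction` parameter, and the top-fraction thresholding rule are all hypothetical choices made for illustration.

```python
import numpy as np


def top_fraction_mask(saliency, fraction=0.1):
    """Binary mask selecting the most salient pixels.

    Hypothetical helper: keeps roughly `fraction` of the pixels, chosen by
    thresholding at the k-th largest saliency value. Ties at the threshold
    can select slightly more than k pixels.
    """
    flat = saliency.ravel()
    k = max(1, int(round(fraction * flat.size)))
    threshold = np.partition(flat, -k)[-k]  # k-th largest value
    return (saliency >= threshold).astype(np.float32)


def blend_localized(original, edited, mask):
    """Take edited pixels inside the mask and original pixels elsewhere."""
    return mask * edited + (1.0 - mask) * original


# Toy 4x4 single-channel example: saliency grows from top-left to bottom-right.
saliency = np.arange(16, dtype=np.float32).reshape(4, 4)
mask = top_fraction_mask(saliency, fraction=0.25)

original = np.zeros((4, 4), dtype=np.float32)
edited = np.ones((4, 4), dtype=np.float32)
blended = blend_localized(original, edited, mask)
```

In this toy case only the bottom row, which holds the four highest saliency values, is replaced by the edited image. For a color image of shape (H, W, C) the mask would need a trailing axis (`mask[..., None]`) so that it broadcasts over channels.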

Changlu Guo, Anders Nymark Christensen, Anders Bjorholm Dahl, Morten Rieger Hannemose • 2026

Related benchmarks

Task | Dataset | Result | Rank
Visual Counterfactual Explanation (Age) | CelebA Standard | FID 0.77 | 11
Visual Counterfactual Explanation (Smile) | CelebA Standard | FID 0.71 | 11
Counterfactual Explanation | ImageNet Zebra - Sorrel | FID 32.5 | 11
Counterfactual Explanation | ImageNet (Cheetah - Cougar) | FID 37.4 | 11
Counterfactual Explanation | ImageNet Egyptian Cat - Persian Cat | FID 36.2 | 11
Counterfactual Visual Explanation | BDD100K | FID 3.19 | 10
Visual Counterfactual Explanation (Age) | CelebA-HQ | FID 4.43 | 9
Visual Counterfactual Explanation (Smile) | CelebA-HQ | FID 2.51 | 9
Counterfactual Visual Explanation | BDD-OIA | FID 5.43 | 7
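Every result above is reported as FID (Fréchet Inception Distance), where lower is better. As a reading aid, here is a minimal sketch of the Fréchet distance between two feature sets, each summarized by a Gaussian with its own mean and covariance. Standard FID applies this formula to Inception-v3 features of real and generated images. The listing does not state which feature extractor or evaluation protocol produced these numbers, so the sketch covers only the distance itself, and the function name is a hypothetical choice.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), n_features >= 2.
    Computes ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((C_a C_b)^(1/2)) equals the sum of square roots of the eigenvalues
    # of C_a C_b, which are real and non-negative for covariance matrices.
    # Clip small negative values that arise from floating-point error.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Toy check with random 4-dimensional "features".
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 4))
same = frechet_distance(feats, feats)           # identical sets
shifted = frechet_distance(feats, feats + 3.0)  # same covariance, mean moved by 3 per dimension
```

Identical feature sets give a distance of zero up to floating-point error. Shifting every feature by 3 leaves the covariance unchanged, so the distance reduces to the squared mean shift, 4 × 3² = 36.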
