Wasserstein Distances Made Explainable: Insights Into Dataset Shifts and Transport Phenomena

About

Wasserstein distances provide a powerful framework for comparing data distributions. They can be used to analyze processes over time or to detect inhomogeneities within data. However, simply calculating the Wasserstein distance or analyzing the corresponding transport plan (or coupling) may not be sufficient for understanding what factors contribute to a high or low Wasserstein distance. In this work, we propose a novel solution based on Explainable AI that allows us to efficiently and accurately attribute Wasserstein distances to various data components, including data subgroups, input features, or interpretable subspaces. Our method achieves high accuracy across diverse datasets and Wasserstein distance specifications, and its practical utility is demonstrated in three use cases.
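To make the idea of attributing a Wasserstein distance to input features concrete, here is a minimal sketch. It is not the authors' method. It assumes two equal-size samples with uniform weights and a squared Euclidean ground cost, in which case an optimal coupling can be taken to be a one-to-one matching and the transport cost splits additively over features. The function name `w2_squared_with_feature_shares` is hypothetical; the matching is computed with SciPy's `linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def w2_squared_with_feature_shares(x, y):
    """Squared 2-Wasserstein distance between two equal-size samples,
    plus a naive per-feature split of the optimal transport cost.

    Assumption: uniform sample weights and squared Euclidean cost.
    This is an illustrative baseline, not the paper's attribution method.
    """
    # Squared difference for every (source, target) pair and feature: (n, n, d).
    diff_sq = (x[:, None, :] - y[None, :, :]) ** 2
    # Pairwise squared Euclidean cost matrix: (n, n).
    cost = diff_sq.sum(axis=2)
    # With equal sizes and uniform weights, an optimal coupling is a matching.
    rows, cols = linear_sum_assignment(cost)
    # Average the matched pairs' per-feature costs; these sum to W_2^2.
    per_feature = diff_sq[rows, cols, :].mean(axis=0)
    return per_feature.sum(), per_feature


# Toy example: two Gaussian samples that differ only by a shift in feature 1.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = rng.normal(size=(200, 3))
y[:, 1] += 2.0

w2_sq, shares = w2_squared_with_feature_shares(x, y)
print("squared W2:", w2_sq)
print("per-feature contributions:", shares)
```

In this toy setup the shifted feature should receive most of the cost. A split like this depends on the particular coupling found and only covers features under an additive cost, which is part of why reading the transport plan alone may not be enough.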

Philip Naumann, Jacob Kauffmann, Grégoire Montavon • 2025

Related benchmarks

| Task | Dataset | Result (Cosine Similarity) | Rank |
|---|---|---|---|
| Shift Attribution | Air Quality (1h shift) | 0.81 | 8 |
| Shift Attribution | Air Quality (2h shift) | 0.87 | 8 |
| Shift Attribution | Air Quality (3h shift) | 0.89 | 8 |
| Shift Attribution | Air Quality (5h shift) | 0.91 | 8 |
| Shift Attribution | Air Quality (4h shift) | 0.88 | 8 |
| Shift Attribution | Air Quality (6h shift) | 0.91 | 8 |
| Shift Attribution | Appliances (1h shift) | 0.46 | 8 |
| Shift Attribution | Appliances (2h shift) | 0.45 | 8 |
| Shift Attribution | Appliances (3h shift) | 0.45 | 8 |
| Shift Attribution | Appliances (4h shift) | 0.45 | 8 |
Showing 10 of 12 rows
