The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation
About
Comparing metric measure spaces (i.e. a metric space endowed with aprobability distribution) is at the heart of many machine learning problems. The most popular distance between such metric measure spaces is theGromov-Wasserstein (GW) distance, which is the solution of a quadratic assignment problem. The GW distance is however limited to the comparison of metric measure spaces endowed with a probability distribution. To alleviate this issue, we introduce two Unbalanced Gromov-Wasserstein formulations: a distance and a more tractable upper-bounding relaxation.They both allow the comparison of metric spaces equipped with arbitrary positive measures up to isometries. The first formulation is a positive and definite divergence based on a relaxation of the mass conservation constraint using a novel type of quadratically-homogeneous divergence. This divergence works hand in hand with the entropic regularization approach which is popular to solve large scale optimal transport problems. We show that the underlying non-convex optimization problem can be efficiently tackled using a highly parallelizable and GPU-friendly iterative scheme. The second formulation is a distance between mm-spaces up to isometries based on a conic lifting. Lastly, we provide numerical experiments onsynthetic examples and domain adaptation data with a Positive-Unlabeled learning task to highlight the salient features of the unbalanced divergence and its potential applications in ML.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Subgraph Alignment | Synthetic 30% subgraph | Accuracy35.15 | 10 | |
| Subgraph Alignment | Synthetic 50% subgraph | Accuracy89.88 | 10 | |
| Subgraph Alignment | Proteins-2 subgraph vs subgraph, 90% overlap | Accuracy67.3 | 10 | |
| Subgraph Alignment | Enzymes 50% subgraph | Accuracy43.73 | 10 | |
| Subgraph Alignment | Synthetic 40% subgraph | Matching Accuracy79.61 | 10 | |
| Subgraph Alignment | Synthetic (20% subgraph) | Accuracy4.48 | 10 | |
| Subgraph Alignment | Proteins subgraph and entire graph 1 | Accuracy0.2572 | 10 | |
| Subgraph Alignment | Proteins (30% subgraph) | Accuracy14.32 | 10 | |
| Subgraph Alignment | Proteins 20% subgraph | Accuracy (20% Subgraph)11.75 | 10 | |
| Subgraph Alignment | Enzymes 20% subgraph | Accuracy10.4 | 10 |