Hierarchical Refinement: Optimal Transport to Infinity and Beyond

About

Optimal transport (OT) has enjoyed great success in machine learning as a principled way to align datasets via a least-cost correspondence, driven in large part by the runtime efficiency of the Sinkhorn algorithm (Cuturi, 2013). However, Sinkhorn has quadratic space and time complexity in the number of points, limiting scalability to larger datasets. Low-rank OT achieves linear complexity but, by definition, cannot compute a one-to-one correspondence between points. When the optimal transport problem is an assignment problem between two datasets, an optimal mapping, known as the Monge map, is guaranteed to be a bijection. In this setting, we show that the factors of an optimal low-rank coupling co-cluster each point with its image under the Monge map. We leverage this invariant to derive an algorithm, Hierarchical Refinement (HiRef), that dynamically constructs a multiscale partition of each dataset using low-rank OT subproblems, culminating in the bijective Monge map. Hierarchical Refinement runs in log-linear time and linear space, retaining the advantages of low-rank OT while overcoming its limited resolution. We demonstrate the advantages of Hierarchical Refinement on several datasets, including ones containing over a million points, scaling full-rank OT to problems previously beyond Sinkhorn's reach.
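To make the refinement idea concrete, here is a toy sketch in Python. It is not the authors' implementation. It assumes 1-D points with squared-Euclidean cost, where the Monge map is known to be monotone, so a median split of both datasets can stand in for the rank-2 low-rank OT subproblem that HiRef would solve in general dimension. The names `split_in_half` and `hierarchical_refinement` are hypothetical and chosen for illustration only.

```python
import numpy as np

def split_in_half(x_idx, y_idx, x, y):
    """Hypothetical stand-in for a rank-2 low-rank OT subproblem.

    Assumption: 1-D points, squared-Euclidean cost. The Monge map is then
    monotone, so cutting both blocks at their medians co-clusters each
    point with its image. A real solver would replace this step.
    """
    xs = x_idx[np.argsort(x[x_idx], kind="stable")]
    ys = y_idx[np.argsort(y[y_idx], kind="stable")]
    half = len(xs) // 2
    return [(xs[:half], ys[:half]), (xs[half:], ys[half:])]

def hierarchical_refinement(x, y):
    """Refine co-clusters until every block holds one matched pair."""
    if len(x) != len(y):
        raise ValueError("both datasets must have the same number of points")
    n = len(x)
    match = np.empty(n, dtype=int)
    if n == 0:
        return match
    # Each stack entry is a pair of index blocks known to map onto each other.
    stack = [(np.arange(n), np.arange(n))]
    while stack:
        x_idx, y_idx = stack.pop()
        if len(x_idx) == 1:
            # Singleton block: the correspondence is fully resolved.
            match[x_idx[0]] = y_idx[0]
            continue
        stack.extend(split_in_half(x_idx, y_idx, x, y))
    return match

x_demo = np.array([0.3, -1.2, 2.0, 0.9])
y_demo = np.array([5.0, 3.5, 4.1, 6.2])
match_demo = hierarchical_refinement(x_demo, y_demo)
print(match_demo)  # match_demo[i] is the index in y_demo paired with x_demo[i]
```

Only index blocks are stored at each scale, never an n-by-n coupling matrix, which is the property that lets the approach avoid Sinkhorn's quadratic memory. The sorting used in this toy split is specific to the 1-D assumption and says nothing about the runtime of the actual method.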

Peter Halmos, Julian Gold, Xinhao Liu, Benjamin J. Raphael • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Transport Map Estimation | human scRNA-seq Hesperadin sci-Plex (test) | Average Sinkhorn Divergence (Dε): 17.3 | 5 |
| Transport Map Estimation | sci-Plex human Belinostat (test) | Average Sinkhorn Divergence (Dε): 19.9 | 5 |
| Transport Map Estimation | human scRNA-seq Dacinostat sci-Plex (test) | Average Sinkhorn Divergence (Dε): 24 | 5 |
| Transport Map Estimation | human scRNA-seq Givinostat sci-Plex (test) | Average Sinkhorn Divergence (Dε): 19.2 | 5 |
| Transport Map Estimation | human scRNA-seq Quisinostat sci-Plex (test) | Average Sinkhorn Divergence (Dε): 22.8 | 5 |
| Spatial gene expression imputation | MERFISH mouse brain (target slice) | Slc17a7 Score: 81 | 3 |