Tail Annealing for Heavy-Tailed Flow Matching

About

Standard generative models struggle with heavy-tailed data: Lipschitz architectures cannot produce power-law tails from Gaussian noise, and interpolating between heavy-tailed data and Gaussians is ill-posed. We propose a simple fix: apply the soft-log transform $\phi(x) = \mathrm{sign}(x) \cdot \log(1 + |x|)$ coordinate-wise to data before training, then exponentiate samples after generation. A Hill diagnostic decides per-coordinate whether to transform, leaving light-tailed margins untouched at no added complexity. This compresses heavy tails into a range where standard flow matching succeeds, without heavy-tailed base distributions or architectural modifications. We provide theoretical intuition for why this works: the log-transform maps Pareto tails to exponentials, and the induced dynamics implement a form of tail annealing via power transformations. On a 144-configuration multivariate benchmark (3 copulas, $d$ up to 100, 4 tail indices), Log-FM dominates specialized baselines on $W_1$, CVaR$_{99}$, and extreme-quantile metrics, and is the only method with zero severe divergences across 2,880 runs.
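The preprocessing described in the abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the paper's implementation: the function names, the choice of `k` upper order statistics, and the `alpha_cutoff` decision threshold are all assumptions made here, since the abstract does not specify how the Hill diagnostic is tuned. The inverse used after generation is $\phi^{-1}(y) = \mathrm{sign}(y) \cdot (e^{|y|} - 1)$, which follows directly from the definition of $\phi$.

```python
import math
import random


def soft_log(x):
    """Soft-log transform: phi(x) = sign(x) * log(1 + |x|)."""
    return math.copysign(math.log1p(abs(x)), x)


def soft_log_inverse(y):
    """Inverse transform: phi^{-1}(y) = sign(y) * (exp(|y|) - 1)."""
    return math.copysign(math.expm1(abs(y)), y)


def hill_tail_index(values, k):
    """Hill estimate of the tail index alpha from the k largest |values|.

    Uses the (k+1)-th largest absolute value as the threshold.
    Requires 0 < k < len(values) and a strictly positive threshold.
    """
    order = sorted((abs(v) for v in values), reverse=True)
    threshold = order[k]
    mean_log_excess = sum(math.log(order[i] / threshold) for i in range(k)) / k
    if mean_log_excess == 0.0:
        return math.inf  # degenerate tail: treat as light-tailed
    return 1.0 / mean_log_excess


def should_transform(values, k, alpha_cutoff=4.0):
    """Hypothetical per-coordinate rule: transform when the estimated
    tail index falls below alpha_cutoff (an assumed value, not from the paper)."""
    return hill_tail_index(values, k) < alpha_cutoff


def preprocess_column(values, k, alpha_cutoff=4.0):
    """Return (transformed_values, was_transformed) for one coordinate."""
    if should_transform(values, k, alpha_cutoff):
        return [soft_log(v) for v in values], True
    return list(values), False


if __name__ == "__main__":
    rng = random.Random(0)
    heavy = [rng.paretovariate(1.5) for _ in range(5000)]  # Pareto, alpha = 1.5
    light = [rng.gauss(0.0, 1.0) for _ in range(5000)]     # Gaussian margin
    for name, column in (("heavy", heavy), ("light", light)):
        transformed, flag = preprocess_column(column, k=250)
        print(name, "alpha_hat =", hill_tail_index(column, 250), "transformed:", flag)
```

After training on the transformed coordinates, generated samples would be mapped back with `soft_log_inverse` on exactly those coordinates that were flagged. Using `log1p` and `expm1` keeps the round trip accurate for values near zero.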

Jean Pachebat • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Heavy-tailed Flow Matching | Gumbel + Gaussian copulas (test) | WP1 | 0.074 | 80 |
| Distribution Estimation | Hickling Student-t benchmark original (test) | Wasserstein-1 distance | 0.14 | 30 |
| Flow Matching | Gumbel + Gaussian alpha=1.5 | Catastrophic Failure Fraction (WP1 > 1) | 2 | 20 |
| Flow Matching | Gumbel + Gaussian (alpha=2.0) | Catastrophic Failure Rate | 0.00e+0 | 20 |
| Generative Modeling | Gumbel + Gaussian Median across all configurations 480 values per cell | W1^P (Pareto Margins) | 0.187 | 20 |
| Generative Modeling | Fama-French 5 | W1 Distance | 0.133 | 5 |