
Minibatch Optimal Transport and Perplexity Bound Estimation in Discrete Flow Matching

About

Discrete flow matching, a recent framework for modeling categorical data, has shown competitive performance with autoregressive models. However, unlike continuous flow matching, the rectification strategy cannot be applied due to the stochasticity of discrete paths, necessitating alternative methods to minimize state transitions. We propose a dynamic-optimal-transport-like minimization objective and derive its Kantorovich formulation for discrete flows with convex interpolants, where transport cost depends solely on inter-state similarity and can be optimized via minibatch strategies. We show that such methods can reduce the number of transitions up to 32 times (1024 to 32) to reach the same generative perplexity without compromising diversity. Additionally, path nondeterminism in discrete flows precludes an instantaneous change-of-variables analogue, preventing precise probability estimation available to continuous flows. We therefore propose two upper bounds on perplexity, enabling principled training, evaluation and model comparison. Finally, we introduce Multimask Flows which outperform masked flows in generative perplexity without compromising diversity, particularly when utilizing minibatch Optimal Transport.
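The minibatch pairing idea in the abstract can be illustrated with a small sketch. It assumes the transport cost between two token sequences is their Hamming distance (the number of positions that would have to change), which is one plausible reading of "inter-state similarity" and not necessarily the cost the authors use. The function names `hamming` and `minibatch_ot_pairing` are hypothetical, and the brute-force search over permutations is only practical for tiny batches; a real implementation would likely use an assignment solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def hamming(a, b):
    """Count positions where two equal-length token sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def minibatch_ot_pairing(sources, targets):
    """Pair each source with one target so the summed cost is minimal.

    Returns (perm, cost), where perm[i] is the index of the target
    assigned to sources[i]. Brute force over all permutations, so this
    is a sketch for very small batches only (assumption: equal batch sizes).
    """
    n = len(sources)
    # Pairwise cost matrix: assumed here to be Hamming distance.
    cost = [[hamming(s, t) for t in targets] for s in sources]
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Toy minibatch of three source and three target sequences (made-up tokens).
sources = [[1, 1, 2, 2], [3, 3, 3, 3], [0, 1, 0, 1]]
targets = [[3, 3, 0, 3], [0, 1, 0, 0], [1, 1, 2, 0]]

perm, cost = minibatch_ot_pairing(sources, targets)
# Cost of the default pairing, where sources[i] is matched with targets[i].
identity_cost = sum(hamming(s, t) for s, t in zip(sources, targets))

print("pairing:", perm)
print("paired cost:", cost, "vs. unpaired cost:", identity_cost)
```

In this toy batch the re-paired sequences differ in 3 positions in total, against 11 for the original ordering, which is the sense in which such a coupling can reduce the number of state transitions a flow has to make.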

Etrit Haxholli, Yeti Z. Gurbuz, Ogul Can, Eli Waxman • 2024

Related benchmarks

Task               Dataset        Result                 Rank
Language Modeling  WikiText-2     Perplexity (PPL) 42    841
Language Modeling  PTB            Perplexity 111.6       650
Language Modeling  WikiText-103   PPL 41.64              146
Language Modeling  LAMBADA        Perplexity 53.19       99
Text Generation    OpenWebText    Perplexity 132.6       66
Language Modeling  LM1B           Perplexity 77.87       7
