Direct Product Flow Matching: Decoupling Radial and Angular Dynamics for Few-Shot Adaptation

About

Recent flow matching (FM) methods improve the few-shot adaptation of vision-language models (VLMs) by modeling cross-modal alignment as a continuous multi-step flow. In this paper, we argue that existing FM methods are inherently constrained by incompatible geometric priors on pre-trained cross-modal features, resulting in suboptimal adaptation performance. We first analyze these methods from a polar decomposition perspective (i.e., radial and angular sub-manifolds). Under this new geometric view, we identify three overlooked limitations: 1) Angular dynamics distortion: The radial-angular coupling induces non-uniform speed on the angular sub-manifold, which makes regression training harder and introduces extra truncation errors. 2) Radial dynamics neglect: Feature normalization discards modality confidence, so the model cannot distinguish out-of-distribution from in-distribution data and loses crucial radial dynamics. 3) Context-agnostic unconditional flow: Dataset-specific information lost during pre-trained cross-modal feature extraction is never recovered. To resolve these issues, we propose warped product flow matching (WP-FM), a unified Riemannian framework that reformulates alignment on a warped product manifold. Within this framework, we derive direct product flow matching (DP-FM) by introducing a constant-warping metric, which yields a decoupled cylindrical manifold (i.e., a direct product manifold). DP-FM enables independent radial evolution and constant-speed angular geodesic transport, effectively eliminating angular dynamics distortion while preserving radial consistency. In addition, we incorporate classifier-free guidance by conditioning the flow on the pre-trained VLMs' hidden states to inject the missing dataset-specific information. Extensive results across 11 benchmarks demonstrate that DP-FM achieves a new state of the art in multi-step few-shot adaptation.
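
The angular dynamics distortion described above can be illustrated with a small numerical sketch. This is not the authors' implementation: the abstract does not specify the exact flow, metric, or network, so the function names (`polar_decompose`, `linear_path`, `decoupled_path`) and the toy 2-D features below are assumptions made only for illustration. The sketch compares a straight-line Euclidean path with a path that interpolates the radius linearly and moves the direction along the great-circle arc at constant angular speed.

```python
import numpy as np

def polar_decompose(x):
    """Split a feature vector into its radius (norm) and unit direction."""
    r = np.linalg.norm(x)
    return r, x / r

def linear_path(x0, x1, t):
    """Straight-line Euclidean interpolation between two features."""
    return (1.0 - t) * x0 + t * x1

def decoupled_path(x0, x1, t):
    """Hypothetical decoupled path: linear radius, great-circle direction.

    Assumes x0 and x1 are nonzero and not antipodal.
    """
    r0, u0 = polar_decompose(x0)
    r1, u1 = polar_decompose(x1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if omega < 1e-8:
        u_t = u0  # directions already coincide
    else:
        # Spherical linear interpolation: constant angular speed omega.
        u_t = (np.sin((1.0 - t) * omega) * u0
               + np.sin(t * omega) * u1) / np.sin(omega)
    r_t = (1.0 - t) * r0 + t * r1  # radius evolves independently
    return r_t * u_t

def angle_from_start(path_fn, x0, x1, ts):
    """Angle between the start direction and the path direction at each t."""
    _, u0 = polar_decompose(x0)
    angles = []
    for t in ts:
        _, u_t = polar_decompose(path_fn(x0, x1, t))
        angles.append(np.arccos(np.clip(np.dot(u0, u_t), -1.0, 1.0)))
    return np.array(angles)

# Toy features with different norms and orthogonal directions.
x0 = np.array([1.0, 0.0])
x1 = np.array([0.0, 3.0])
ts = np.linspace(0.0, 1.0, 5)

# Angular increment per equal time step along each path.
lin_steps = np.diff(angle_from_start(linear_path, x0, x1, ts))
dec_steps = np.diff(angle_from_start(decoupled_path, x0, x1, ts))

print("linear path angular steps (rad):   ", np.round(lin_steps, 3))
print("decoupled path angular steps (rad):", np.round(dec_steps, 3))
```

On this toy pair, the straight-line path covers most of the angle in the first time step and very little in the last, while the decoupled path advances by the same angle at every step and its radius moves linearly from 1 to 3.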

Hongxu Chen, Yanghao Wang, Bowei Zhu, Hongxiang Li, Zhen Wang, Ziqi Jiang, Lin Li, Rui Liu, Long Chen • 2026

Related benchmarks

Task                  Dataset   Metric    Result  Rank
Classification        Cars      Accuracy  88.8    492
Image Classification  Pets      Accuracy  93.1    308
Image Classification  Caltech   Accuracy  96.6    129
Image Classification  Food      Accuracy  85.8    91
Image Classification  SUN       Accuracy  77.6    65
Image Classification  Aircraft  Accuracy  59.2    58
Image Classification  SAT       Accuracy  0.931   56
Image Classification  UCF       Accuracy  88      45
Image Classification  Net       Accuracy  74.4    28
