Learning Modality Knowledge Alignment for Cross-Modality Transfer

About

Cross-modality transfer aims to leverage large pretrained models to complete tasks that may not belong to the modality of the pretraining data. Existing works have had some success in extending classical finetuning to cross-modal scenarios, yet we still lack an understanding of how the modality gap influences the transfer. In this work, we conduct a series of experiments focusing on source representation quality during transfer, revealing that a larger modality gap is associated with less knowledge reuse, which means ineffective transfer. We then formalize the gap as the knowledge misalignment between modalities using the conditional distribution P(Y|X). To address this problem, we present Modality kNowledge Alignment (MoNA), a meta-learning approach that learns a target data transformation to reduce the modality knowledge discrepancy ahead of the transfer. Experiments show that our method enables better reuse of source modality knowledge in cross-modality transfer, which leads to improvements over existing finetuning methods.

Wenxuan Ma, Shuang Li, Lincan Cai, Jingxuan Kang • 2024

Related benchmarks

Task	Dataset	Result
PDE solving	PDEBench Diff.Sorp (test)	nRMSE 0.0016	65
PDE solving	PDEBench Diff.Reac 1D (test)	nRMSE 0.0028	41
Cross-modal adaptation	NAS-Bench-360	Darcy (Relative L2) 0.0068	9
PDE solving	PDEBench Advection (test)	nRMSE 0.0088	9
Diverse Prediction Tasks	NAS-Bench-360 (test)	Darcy Score 0.0068	9
PDE solving	PDEBench Diffusion Reaction (1D)	nRMSE 0.0028	8
PDE solving	PDEBench Darcy (test)	nRMSE 0.079	8
Darcy	PDEBench	nRMSE 0.079	5
PDE solving	PDEBench Diffusion Sorption (1D)	nRMSE 0.0016	5
Aggregate Performance Ranking	PDEBench Multiple Tasks	Avg Rank 1.875	5

Showing 10 of 19 rows
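
For readers unfamiliar with the metrics in the table, here is a minimal sketch of how a normalized RMSE (nRMSE) and a relative L2 error are commonly computed. The function names `nrmse` and `relative_l2` are illustrative, and the exact normalization and averaging over samples or channels used by each benchmark is an assumption here and may differ.

```python
import numpy as np


def nrmse(pred, target):
    """RMSE of the prediction error divided by the RMS magnitude of the target.

    Assumption: a single global normalization over the whole array; benchmarks
    may instead normalize per sample or per channel before averaging.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    rmse = np.sqrt(np.mean((pred - target) ** 2))
    scale = np.sqrt(np.mean(target ** 2))
    return rmse / scale


def relative_l2(pred, target):
    """L2 norm of the error divided by the L2 norm of the target."""
    pred = np.asarray(pred, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    return np.linalg.norm(pred - target) / np.linalg.norm(target)


if __name__ == "__main__":
    target = np.array([3.0, 4.0])
    pred = np.array([3.3, 4.4])
    print(nrmse(pred, target), relative_l2(pred, target))
```

Under this global normalization the two quantities coincide for a single array, since the averaging factors cancel. Lower values indicate predictions closer to the reference solution.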
