Generalizable Multimodal Large Language Model Editing via Invariant Trajectory Learning
About
Knowledge editing has emerged as a crucial technique for efficiently correcting incorrect or outdated knowledge in large language models (LLMs). Existing editing methods rely on a rigid mapping from parameter or module modifications to outputs, which limits generalization in multimodal LLMs (MLLMs). In this paper, we reformulate MLLM editing as an out-of-distribution (OOD) generalization problem, where the goal is to distinguish semantic shift from factual shift and thus achieve robust editing across diverse cross-modal prompts. The key challenge of this OOD problem lies in identifying invariant causal trajectories that generalize accurately while suppressing spurious correlations. To address it, we propose ODEdit, a plug-and-play, invariant-learning-based framework that optimizes a tripartite OOD risk objective to simultaneously enhance editing reliability, locality, and generality. We further introduce an edit-trajectory invariant learning method, which integrates a total variation penalty into the risk minimization objective to stabilize edit trajectories against environmental variations. Theoretical analysis and extensive experiments demonstrate the effectiveness of ODEdit.
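The abstract describes combining a multi-environment (tripartite) risk with a total variation penalty on edit trajectories. A minimal sketch of that combination is shown below; the function names, the averaging of per-environment risks, and the penalty weight are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def total_variation_penalty(trajectory):
    """Sum of L2 norms of consecutive differences along an edit trajectory.

    `trajectory` has shape (T, d): T intermediate states of dimension d.
    A perfectly stable trajectory (no step-to-step change) has penalty 0.
    """
    diffs = np.diff(trajectory, axis=0)  # (T-1, d) step-to-step changes
    return float(np.sum(np.linalg.norm(diffs, axis=1)))

def tripartite_ood_risk(env_risks, trajectories, tv_weight=0.1):
    """Hypothetical objective: average the per-environment risks (standing in
    for the reliability, locality, and generality terms) and add a weighted
    total variation penalty that discourages trajectory drift across
    environments."""
    risk = float(np.mean(env_risks))
    tv = sum(total_variation_penalty(t) for t in trajectories)
    return risk + tv_weight * tv

# Toy usage: three per-environment risks and two short trajectories.
risks = [0.2, 0.3, 0.25]
trajs = [
    np.zeros((4, 3)),                                      # stable, TV = 0
    np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]),        # TV = 1 + 1 = 2
]
print(tripartite_ood_risk(risks, trajs))  # 0.25 + 0.1 * 2.0 = 0.45
```

In practice the trajectories would be the model's hidden-state sequences under different prompting environments, and the penalty would be minimized jointly with the editing loss.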
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Knowledge Editing | MMEdit 1.0 E-VQA (test) | Reliability | 100 | 24 |
| Knowledge Editing | MMEdit 1.0 E-IC (test) | Reliability | 100 | 24 |
| Sequential Knowledge Editing | Editing VQA (E-VQA), T=5 steps | Rel. Score | 92.63 | 4 |
| Sequential Knowledge Editing | Editing VQA (E-VQA), T=10 steps | Rel. Score | 89.79 | 4 |
| Sequential Knowledge Editing | Editing Image Caption (E-IC), T=5 steps | Rel. Score | 86.53 | 4 |
| Sequential Knowledge Editing | Editing Image Caption (E-IC), T=10 steps | Rel. Score | 84.64 | 4 |