PromptBridge: Cross-Model Prompt Transfer for Large Language Models

About

Large language models (LLMs) underpin applications in code generation, mathematical reasoning, and agent-based workflows. In practice, systems access LLMs via commercial APIs or open-source deployments, and the model landscape (e.g., GPT, Claude, Llama) evolves rapidly. This rapid evolution forces frequent model switches driven by capability, cost, deployment constraints, and privacy. Yet prompts are highly model-sensitive: reusing a prompt engineered for one model on another often yields substantially worse performance than a prompt optimized for the target model. We term this phenomenon Model Drifting. Through extensive empirical analysis across diverse LLM configurations, we show that model drifting is both common and severe. To address this challenge, we introduce PromptBridge, a training-free framework that preserves prompt effectiveness under model switches, enabling cross-model prompt transfer without costly per-task or per-model re-optimization. PromptBridge requires only a small set of alignment tasks for calibration. It first applies Model-Adaptive Reflective Prompt Evolution (MAP-RPE) to obtain task- and model-specific optimal prompts via iterative reflective refinement and quantitative evaluation. Using the resulting calibrated prompt pairs for the source and target models, PromptBridge learns a cross-model prompt mapping. At test time, i.e., for an unseen task, given a source-model prompt, this mapping directly produces an optimized prompt for the target model. Experiments in single-agent and multi-agent settings show that PromptBridge consistently improves downstream accuracy while reducing migration effort. The code will be available soon.
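The iterative refine-and-evaluate idea in the abstract can be illustrated with a generic loop. The sketch below is not the authors' MAP-RPE implementation (their code is not yet released). It only shows the general pattern of proposing revised prompts, scoring them quantitatively, and keeping the best one. All names (`refine_prompt`, `toy_propose`, `toy_score`, `HINTS`) are hypothetical, and the toy scorer stands in for evaluating a real target model on alignment tasks.

```python
from typing import Callable, List, Tuple


def refine_prompt(
    seed_prompt: str,
    propose: Callable[[str, float], List[str]],
    score: Callable[[str], float],
    rounds: int = 3,
) -> Tuple[str, float]:
    """Keep the best-scoring prompt over a few propose-and-evaluate rounds."""
    best_prompt, best_score = seed_prompt, score(seed_prompt)
    for _ in range(rounds):
        # Candidates are generated from the current best prompt and its score.
        for candidate in propose(best_prompt, best_score):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score


# Toy stand-ins (assumptions): a real setup would query the target LLM instead.
HINTS = ["Think step by step.", "Return only code."]


def toy_propose(prompt: str, current_score: float) -> List[str]:
    # Pretend "reflection": suggest appending one hint that is not yet present.
    return [f"{prompt} {hint}" for hint in HINTS if hint not in prompt]


def toy_score(prompt: str) -> float:
    # Pretend evaluation: fraction of the hints that the prompt contains.
    return sum(hint in prompt for hint in HINTS) / len(HINTS)


if __name__ == "__main__":
    best, best_score = refine_prompt("Solve the task.", toy_propose, toy_score)
    print(best, best_score)
```

Running such a loop once per model on the same alignment task would yield a source-model prompt and a target-model prompt, which is the kind of calibrated pair the abstract says the mapping is learned from. How that mapping is learned is not specified here, so it is left out of the sketch.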

Yaxuan Wang, Quan Liu, Zhenting Wang, Zichao Li, Wei Wei, Yang Liu, Yujia Bao • 2025

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval	Pass@1 98.37	1048
Code Generation	HumanEval (test)	--	701
Code Generation	MBPP	Pass@1 80.6	193
Code Generation	MBPP	Accuracy 78.59	165
Code Generation	APPS	Pass@1 38	111
Code Generation	CodeContests	Pass@1 58.79	68
Code Generation	xCodeEval	Pass@1 77.67	36
Software Engineering Issue Solving	SWE-bench Verified	Accuracy 46	15
Agent	Terminal-Bench	Accuracy 18.75	12
Sole Planning	TravelPlanner (val)	Final Pass Rate 7.22	8

