Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

About

Although studies have demonstrated that Large Language Models (LLMs) can perform well on Out-of-Distribution (OOD) tasks, their advantage tends to diminish as the distribution shift becomes more severe. Consequently, researchers aim to retrieve distributionally similar and informative demonstrations from the available source domain to boost the inference capabilities of LLMs. However, in practical scenarios where the target domain is inaccessible, evaluating the unknown distribution is challenging, which indirectly impacts the quality of the selected demonstrations. To address this problem, we propose DOPA, a demonstration search framework that incorporates an OOD proxy to approximate the inaccessible target domain and guide the retrieval process. Building on proxy-based evaluation, DOPA further introduces a Mahalanobis distance-based global diversity constraint to ensure sufficient diversity among the retrieved demonstrations. Experimental results on multiple LLMs and tasks demonstrate that DOPA effectively enhances robustness in OOD settings. Code: https://github.com/bort64/ood_code
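To make the idea of a Mahalanobis distance-based diversity constraint concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how DOPA formulates the constraint, so the greedy selection rule, the function names (mahalanobis_matrix, select_diverse), the min_dist threshold, and the random relevance scores standing in for a proxy-based score are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_matrix(X, eps=1e-6):
    """Pairwise Mahalanobis distances between the rows of X, shape (n, d)."""
    # Covariance of the candidate pool; eps keeps the matrix invertible.
    cov = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    cov_inv = np.linalg.inv(cov)
    diff = X[:, None, :] - X[None, :, :]  # shape (n, n, d)
    sq = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))

def select_diverse(X, relevance, k, min_dist):
    """Greedy selection (an assumed rule, not taken from the paper).

    Candidates are visited in order of decreasing relevance; one is kept
    only if it lies at least min_dist from every candidate already kept.
    """
    D = mahalanobis_matrix(X)
    chosen = []
    for i in np.argsort(-relevance):
        if all(D[i, j] >= min_dist for j in chosen):
            chosen.append(int(i))
        if len(chosen) == k:
            break
    return chosen

# Hypothetical data: 50 candidate embeddings with made-up relevance scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
relevance = rng.random(50)
picked = select_diverse(X, relevance, k=5, min_dist=1.0)
```

The Mahalanobis distance rescales each direction by the covariance of the candidate pool, so two demonstrations count as far apart only when they differ along directions in which the pool itself varies little. A plain Euclidean threshold would not account for that.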

Hao Xu, Rite Bo, Fausto Giunchiglia, Yingji Li, Rui Song • 2026

Related benchmarks

Task	Dataset	Result
Natural Language Inference	aNLI	Accuracy 38.1	107
Toxicity Detection	Toxigen	Score 79	95
Sentiment Analysis	SST	Accuracy 78.35	75
Question Answering	NewsQA	F1 61.49	67
Toxicity Detection	Implicit Hate	Accuracy 62.87	52
Sentiment Analysis	SemEval	Score 63.32	46
Question Answering	SearchQA	EM 40.95	46
Natural Language Inference	WANLI	Accuracy (WANLI) 45.4	42
Sentiment Analysis	DynaSent	Accuracy 70.9	42
Natural Language Inference	CNLI	Accuracy 46.1	42

Showing 10 of 17 rows
