In-Context Learning with Iterative Demonstration Selection

About

Spurred by advancements in scale, large language models (LLMs) have demonstrated strong few-shot learning ability via in-context learning (ICL). However, the performance of ICL has been shown to be highly sensitive to the selection of few-shot demonstrations. Selecting the most suitable examples as context remains an ongoing challenge and an open problem. Existing literature has highlighted the importance of selecting examples that are diverse or semantically similar to the test sample while ignoring the fact that the optimal selection dimension, i.e., diversity or similarity, is task-specific. Based on how the test sample is answered, we propose Iterative Demonstration Selection (IDS) to leverage the merits of both dimensions. Using zero-shot chain-of-thought reasoning (Zero-shot-CoT), IDS iteratively selects examples that are diverse but still strongly correlated with the test sample as ICL demonstrations. Specifically, IDS applies Zero-shot-CoT to the test sample before demonstration selection. The output reasoning path is then used to choose demonstrations that are prepended to the test sample for inference. The generated answer is followed by its corresponding reasoning path for extracting a new set of demonstrations in the next iteration. After several iterations, IDS adopts majority voting to obtain the final result. Through extensive experiments on tasks including reasoning, question answering, and topic classification, we demonstrate that IDS can consistently outperform existing ICL demonstration selection methods.
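The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how a reasoning path is matched against candidate demonstrations, so the token-overlap (Jaccard) score here is a stand-in assumption. The names `ids_predict`, `select_demos`, `zero_shot_cot`, and `few_shot_cot`, and the defaults `k=2` and `iterations=3`, are likewise hypothetical. The two LLM calls are passed in as plain callables that return a `(reasoning_path, answer)` pair.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]    # (question, answer) pair from the candidate pool
CoTOutput = Tuple[str, str]  # (reasoning_path, answer) returned by an LLM call


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the real scorer."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_demos(reasoning_path: str, pool: List[Example], k: int) -> List[Example]:
    """Pick the k pool examples most similar to the current reasoning path."""
    ranked = sorted(
        pool,
        key=lambda ex: jaccard(reasoning_path, ex[0] + " " + ex[1]),
        reverse=True,
    )
    return ranked[:k]


def ids_predict(
    question: str,
    pool: List[Example],
    zero_shot_cot: Callable[[str], CoTOutput],
    few_shot_cot: Callable[[List[Example], str], CoTOutput],
    k: int = 2,
    iterations: int = 3,
) -> str:
    """Iteratively reselect demonstrations, then majority-vote the answers."""
    # Step 1: Zero-shot-CoT on the test sample, before any selection.
    reasoning_path, _ = zero_shot_cot(question)
    answers = []
    for _ in range(iterations):
        # Step 2: use the latest reasoning path to choose demonstrations.
        demos = select_demos(reasoning_path, pool, k)
        # Step 3: prepend them to the test sample and run inference; the new
        # reasoning path drives selection in the next iteration.
        reasoning_path, answer = few_shot_cot(demos, question)
        answers.append(answer)
    # Step 4: majority voting over the per-iteration answers.
    return Counter(answers).most_common(1)[0][0]
```

Passing the model calls in as callables keeps the sketch independent of any particular LLM client; a real setup would wrap its own API calls and likely replace `jaccard` with an embedding-based similarity.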

Chengwei Qin, Aston Zhang, Chen Chen, Anirudh Dagar, Wenming Ye • 2023

Related benchmarks

Task	Dataset	Result
Readmission prediction	MIMIC IV	AUC-ROC 0.4741	74
Mortality Prediction	MIMIC-III	AUROC 59.89	50
Readmission Prediction (RA)	MIMIC-IV (test)	ROC AUC 0.4693	37
Length-of-Stay Prediction	MIMIC-III	Macro ROC AUC 49.86	28
Length of Stay (LOS) prediction	MIMIC-III (test)	Macro ROC AUC 52.47	14
Mortality Prediction	MIMIC-III (test)	AUROC 56.72	14
Medical Reasoning	CMB	Exact Match (EM) 79.57	12
Medical Reasoning	MedQA	EM 52.59	12
Medical Reasoning	CMB clin	BLEU-1 22.63	12
Volatility Forecasting	S&P500	MAE 1.22	8

