Dr.ICL: Demonstration-Retrieved In-context Learning

About

In-context learning (ICL), teaching a large language model (LLM) to perform a task with few-shot demonstrations rather than adjusting the model parameters, has emerged as a strong paradigm for using LLMs. While early studies primarily used a fixed or random set of demonstrations for all test queries, recent research suggests that retrieving semantically similar demonstrations to the input from a pool of available demonstrations results in better performance. This work expands the applicability of retrieval-based ICL approaches by demonstrating that even simple word-overlap similarity measures such as BM25 outperform randomly selected demonstrations. Furthermore, we extend the success of retrieval-based ICL to instruction-finetuned LLMs as well as Chain-of-Thought (CoT) prompting. For instruction-finetuned LLMs, we find that although a model has already seen the training data at training time, retrieving demonstrations from the training data at test time yields better results compared to using no demonstrations or random demonstrations. Last but not least, we train a task-specific demonstration retriever that outperforms off-the-shelf retrievers.
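The retrieval step described in the abstract can be sketched in a few lines. The block below is a minimal illustration, not the authors' implementation: it scores a pool of labeled demonstrations against a test query with a hand-written Okapi BM25 (using the smoothed, always-positive idf variant), keeps the top k, and formats them as a few-shot prompt. The whitespace tokenizer, the `k1` and `b` defaults, the `Input:`/`Label:` template, and all function names are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems use better tokenizers.
    return text.lower().split()


class BM25:
    """Hand-written Okapi BM25 over a small in-memory pool (illustrative only)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of documents containing each term.
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed idf that stays positive for very common terms.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl)
        total = 0.0
        for t in tokenize(query):
            f = tf.get(t, 0)
            if f:
                total += self.idf[t] * f * (self.k1 + 1) / (f + norm)
        return total


def retrieve(pool, query, k=2):
    """Return the k (input, label) pairs whose inputs score highest against the query."""
    bm25 = BM25([text for text, _ in pool])
    ranked = sorted(range(len(pool)), key=lambda i: bm25.score(query, i), reverse=True)
    return [pool[i] for i in ranked[:k]]


def build_prompt(demos, query):
    # Hypothetical prompt template: retrieved demonstrations first, test query last.
    shots = [f"Input: {text}\nLabel: {label}" for text, label in demos]
    return "\n\n".join(shots + [f"Input: {query}\nLabel:"])
```

Calling `build_prompt(retrieve(pool, query), query)` gives a prompt whose demonstrations share words with the query. A random-demonstration baseline would replace `retrieve` with a uniform sample from the same pool.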

Man Luo, Xin Xu, Zhuyun Dai, Panupong Pasupat, Mehran Kazemi, Chitta Baral, Vaiva Imbrasaite, Vincent Y Zhao • 2023

Related benchmarks

Task	Dataset	Metric	Result
Natural Language Inference	aNLI	Accuracy	36.1	107
Toxicity Detection	Toxigen	Score	77.43	95
Sentiment Analysis	SST	Accuracy	78.26	75
Toxicity Detection	Implicit Hate	Accuracy	63.1	52
Sentiment Analysis	SemEval	Score	63.82	46
Natural Language Inference	CNLI	Accuracy	47.2	42
Natural Language Inference	WANLI	Accuracy (WANLI)	42.77	42
Sentiment Analysis	DynaSent	Accuracy	68.4	42
Toxicity Detection	Adv	Accuracy	57.04	42
Named Entity Recognition	wnut	Accuracy	56.03	40

Showing 10 of 11 rows
