In-Context Learning Demonstration Selection via Influence Analysis

About

Large Language Models (LLMs) have demonstrated In-Context Learning (ICL) capabilities, which enable few-shot learning without gradient updates. Despite this advantage, the effectiveness of ICL depends heavily on the choice of demonstrations, and selecting the most effective demonstrations remains a significant research challenge. To tackle this issue, we propose a demonstration selection method named InfICL, which uses influence functions to analyze the impact of training samples. By identifying the most influential training samples as demonstrations, InfICL aims to improve ICL generalization performance. To keep InfICL cost-effective, we use the LLM only to generate sample input embeddings, avoiding expensive fine-tuning. Through empirical studies on various real-world datasets, we demonstrate the advantages of InfICL over state-of-the-art baselines.
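
The selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier, the scoring rule, or the selection rule, so everything below is an assumption. The sketch assumes that LLM input embeddings are already computed (random vectors stand in for them here), fits a small L2-regularised logistic regression on those embeddings, scores each training sample with the classic influence-function approximation g_val^T H^{-1} g_train, and keeps the top-scoring samples per class. The names `fit_logreg`, `helpfulness_scores`, and `select_demonstrations` are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logreg(X, y, l2=1e-2, lr=0.5, steps=500):
    """Fit L2-regularised binary logistic regression by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y) + l2 * w
        w -= lr * grad
    return w

def helpfulness_scores(X_train, y_train, X_val, y_val, w, l2=1e-2):
    """Score each training sample by g_val^T H^{-1} g_train_i.

    This is the negated influence of upweighting the sample on the mean
    validation loss, so a larger score means upweighting the sample is
    estimated to lower the validation loss more.
    """
    p_train = sigmoid(X_train @ w)
    # Per-sample training-loss gradients, shape (n_train, d).
    g_train = (p_train - y_train)[:, None] * X_train
    # Hessian of the regularised mean training loss, shape (d, d).
    s = p_train * (1.0 - p_train)
    H = (X_train * s[:, None]).T @ X_train / len(y_train)
    H += l2 * np.eye(X_train.shape[1])
    # Gradient of the mean validation loss, shape (d,).
    g_val = X_val.T @ (sigmoid(X_val @ w) - y_val) / len(y_val)
    # Solve H x = g_val instead of forming H^{-1} explicitly.
    return g_train @ np.linalg.solve(H, g_val)

def select_demonstrations(scores, y_train, k_per_class):
    """Return indices of the k highest-scoring training samples per class."""
    chosen = []
    for c in np.unique(y_train):
        idx = np.flatnonzero(y_train == c)
        top = idx[np.argsort(scores[idx])[::-1][:k_per_class]]
        chosen.extend(top.tolist())
    return np.array(chosen)

# Synthetic stand-in for LLM embeddings (assumption: 8-dim, binary labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X @ rng.normal(size=8) + 0.5 * rng.normal(size=200) > 0).astype(float)
X_train, y_train, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

w = fit_logreg(X_train, y_train)
scores = helpfulness_scores(X_train, y_train, X_val, y_val, w)
demo_idx = select_demonstrations(scores, y_train, k_per_class=3)
```

In a real pipeline the selected indices would point back to the original training texts, which would then be placed in the prompt as demonstrations. Selecting per class is one possible design choice to keep the prompt label-balanced; the paper's actual rule may differ.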

Vinay M.S., Minh-Hao Van, Xintao Wu • 2024

Related benchmarks

Task	Dataset	Metric	Result
Natural Language Inference	aNLI	Accuracy	37.23	107
Toxicity Detection	Toxigen	Score	65.97	95
Sentiment Analysis	SST	Accuracy	75.35	75
Toxicity Detection	Implicit Hate	Accuracy	59	52
Sentiment Analysis	SemEval	Score	64.04	46
Toxicity Detection	Adv	Accuracy	59.74	42
Sentiment Analysis	DynaSent	Accuracy	70.71	42
Natural Language Inference	CNLI	Accuracy	45.29	42
Natural Language Inference	WANLI	Accuracy (WANLI)	40.87	42
Toxicity Detection	Toxicity Detection (TD) suite implicit adv toxigen	Implicit Score	60	10

Showing 10 of 12 rows
