In-context Example Selection with Influences
About
In-context learning (ICL) is a powerful paradigm that emerged from large language models (LLMs). Despite its promise, ICL performance is known to be highly sensitive to the choice of input examples. In this work, we use $\textit{in-context influences}$ to analyze few-shot ICL performance directly from the in-context examples. Our proposed influence-based example selection method can identify both positive and negative examples, outperforming several baselines when evaluated on 9 SuperGLUE tasks. Our analysis uncovers up to a $16.3\%$ performance gap between using the most negative in-context examples and the most positive. In a case study, we apply our influence-based framework to quantify the phenomenon of recency bias in example ordering for few-shot ICL.
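The core idea can be sketched as follows: score each candidate example by the difference in mean validation performance between few-shot prompts that include it and prompts that do not. This is a minimal illustrative sketch, not the paper's implementation; `eval_fn` is a hypothetical callback that builds a k-shot prompt from a subset of examples and returns a validation metric.

```python
import random
from statistics import mean

def in_context_influences(examples, eval_fn, n_trials=200, k=4):
    """Estimate each example's in-context influence as the gap in mean
    validation score between random k-shot prompts that contain it and
    those that do not. `eval_fn` is an assumed user-supplied scorer."""
    with_scores = {i: [] for i in range(len(examples))}
    without_scores = {i: [] for i in range(len(examples))}
    for _ in range(n_trials):
        # Sample a random k-shot subset and score the resulting prompt once.
        subset = set(random.sample(range(len(examples)), k))
        score = eval_fn([examples[i] for i in subset])
        for i in range(len(examples)):
            (with_scores if i in subset else without_scores)[i].append(score)
    # Influence = mean score with the example minus mean score without it.
    return {
        i: mean(with_scores[i]) - mean(without_scores[i])
        for i in range(len(examples))
        if with_scores[i] and without_scores[i]
    }
```

Examples with the largest positive influence are selected for the prompt; those with the most negative influence are the ones the paper's analysis flags as harmful.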
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Readmission Prediction | MIMIC-IV | AUC-ROC | 0.5285 | 70 |
| Mortality Prediction | MIMIC-III | AUROC | 61.73 | 46 |
| Readmission Prediction (RA) | MIMIC-IV (test) | ROC AUC | 0.5045 | 33 |
| Length-of-Stay Prediction | MIMIC-III | Macro ROC AUC | 55.57 | 28 |
| Mortality Prediction | MIMIC-III (test) | AUROC | 62.45 | 14 |
| Length-of-Stay (LOS) Prediction | MIMIC-III (test) | Macro ROC AUC | 54.24 | 14 |
| Medical Reasoning | CMB | Exact Match (EM) | 83.5 | 12 |
| Medical Reasoning | MedQA | EM | 53.02 | 12 |
| Medical Reasoning | CMB clin | BLEU-1 | 23.88 | 12 |