Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
About
Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available. In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input using a raw text corpus. Concretely, pseudo-demonstrations are constructed by (1) finding the nearest neighbors to the test input from the corpus and pairing them with random task labels, and (2) applying a set of techniques to reduce the amount of direct copying the model does from the resulting demonstrations. Evaluation on nine classification datasets shows that Z-ICL outperforms previous zero-shot methods by a significant margin, and is on par with in-context learning with labeled training data in the few-shot setting. Overall, Z-ICL provides a significantly higher estimate of the zero-shot performance levels of a model, and supports future efforts to develop better pseudo-demonstrations that further improve zero-shot results.
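Step (1) of the construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The bag-of-words cosine similarity stands in for whatever retrieval model is actually used, and the function names (`build_pseudo_demos`, `format_prompt`), the prompt template, and the parameters `k` and `seed` are all hypothetical. Step (2), the techniques for reducing direct copying, is omitted because the abstract does not specify them.

```python
import math
import random
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed stand-in for a sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_pseudo_demos(test_input, corpus, labels, k=2, seed=0):
    """Retrieve the k corpus sentences nearest to the test input and
    pair each one with a randomly chosen task label."""
    rng = random.Random(seed)
    query = bow(test_input)
    ranked = sorted(corpus, key=lambda s: cosine(query, bow(s)), reverse=True)
    return [(sentence, rng.choice(labels)) for sentence in ranked[:k]]


def format_prompt(demos, test_input):
    """Concatenate the pseudo-demonstrations and the test input
    (hypothetical template, chosen only for illustration)."""
    blocks = [f"Review: {text}\nSentiment: {label}\n" for text, label in demos]
    blocks.append(f"Review: {test_input}\nSentiment:")
    return "\n".join(blocks)
```

Calling `build_pseudo_demos("the movie was great", corpus, ["positive", "negative"])` on a small list of unlabeled sentences returns the two most similar sentences, each with a random label. Passing that result to `format_prompt` yields a prompt that ends with the unlabeled test input, ready to be completed by a language model.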
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Sentiment Classification | SST2 (test) | Accuracy | 87.8 | 214 |
| Sentiment Analysis | SST-5 (test) | Accuracy | 38.7 | 173 |
| Sentiment Classification | MR (test) | Accuracy | 84 | 142 |
| Sentiment Classification | CR (test) | Mean Accuracy | 91.4 | 58 |
| Sentiment Classification | Yelp (test) | Accuracy | 96 | 46 |
| Sentiment Classification | Yelp5 (test) | Accuracy | 97.7 | 34 |
| Sentiment Classification | Amz5 (test) | Accuracy | 93 | 34 |
| Sentiment Classification | Tweet (test) | Accuracy | 46.8 | 34 |
| Sentiment Classification | Amz (test) | Accuracy | 94.9 | 25 |