Pre-Training to Learn in Context

About

In-context learning, where pre-trained language models learn to perform tasks from task examples and instructions in their contexts, has attracted much attention in the NLP community. However, the ability of in-context learning is not fully exploited because language models are not explicitly trained to learn in context. To this end, we propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability by pre-training the model on a large collection of "intrinsic tasks" in the general plain-text corpus using the simple language modeling objective. PICL encourages the model to infer and perform tasks by conditioning on the contexts while maintaining task generalization of pre-trained models. We evaluate the in-context learning performance of the model trained with PICL on seven widely-used text classification datasets and the Super-NaturalInstructions benchmark, which contains 100+ NLP tasks formulated as text generation. Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x as many parameters. The code is publicly available at https://github.com/thu-coai/PICL.
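
To make the in-context setup described above concrete, here is a minimal sketch of how a few-shot prompt might be assembled from an instruction, labeled task examples, and an unlabeled query. The function name build_icl_prompt and the "Input:/Output:" template are assumptions made for illustration only; they are not taken from the PICL code base, and the prompt formats used in the paper may differ.

```python
from typing import List, Tuple


def build_icl_prompt(
    demos: List[Tuple[str, str]],
    query: str,
    instruction: str = "",
) -> str:
    """Join an optional instruction, labeled demonstrations, and an unlabeled query.

    The "Input:/Output:" template is a hypothetical choice for this sketch,
    not the format used by PICL.
    """
    parts = []
    if instruction:
        parts.append(instruction.strip())
    for text, label in demos:
        parts.append(f"Input: {text}\nOutput: {label}")
    # The query carries no label; a language model would be asked to continue from here.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


demos = [
    ("A warm, funny film.", "positive"),
    ("Dull and overlong.", "negative"),
]
prompt = build_icl_prompt(demos, "An instant classic.", "Classify the sentiment.")
print(prompt)
```

The resulting string would be passed to a language model, whose continuation after the final "Output:" is read as the prediction. No model weights are updated at this stage; the task is conveyed entirely through the context.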

Yuxian Gu, Li Dong, Furu Wei, Minlie Huang • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Natural Language Inference | RTE | Accuracy | 54 | 367 |
| Subjectivity Classification | Subj | Accuracy | 72.5 | 266 |
| Sentiment Classification | SST-2 | Accuracy | 86.9 | 174 |
| Topic Classification | AG-News | Accuracy | 67.5 | 173 |
| Sentiment Classification | MR | Accuracy | 83.6 | 148 |
| Sentiment Classification | SST-5 | Accuracy | 38 | 31 |
| Natural Language Inference | CB | Average Accuracy | 70 | 29 |
| Instruction Following | Super-Natural Instructions (test) | ROUGE-L | 37.6 | 21 |
| Text Classification | SST2, SUBJ, MR, RTE, AgNews, CB, SST5 (test) | SST2 Accuracy | 79.7 | 14 |
