
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning

About

Prompt learning approaches have made waves in natural language processing by inducing better few-shot performance, yet they still follow a parametric learning paradigm, in which forgetting and rote memorization can cause unstable generalization. Specifically, vanilla prompt learning may memorize atypical instances by rote during fully-supervised training, or overfit shallow patterns with low-shot data. To alleviate these limitations, we develop RetroPrompt, motivated by decoupling knowledge from memorization to help the model strike a balance between generalization and memorization. In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances and implements a retrieval mechanism during input, training, and inference, thus equipping the model with the ability to retrieve related contexts from the training corpus as cues for enhancement. Extensive experiments demonstrate that RetroPrompt obtains better performance in both few-shot and zero-shot settings. Moreover, we show that RetroPrompt yields better generalization abilities on new datasets. A detailed analysis of memorization reveals that RetroPrompt indeed reduces the reliance of language models on memorization, thereby improving generalization on downstream tasks. Code is available at https://github.com/zjunlp/PromptKG/tree/main/research/RetroPrompt.
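The core retrieval mechanism can be sketched as follows: encode every training instance into a key vector to form the open-book knowledge-store, then, for a new input, retrieve its nearest neighbors and use them as extra context (cues) for the prompt. This is a minimal illustration, not the paper's implementation; in particular, the character-bigram encoder below is a hypothetical stand-in for the language-model embeddings RetroPrompt would actually use.

```python
import numpy as np

def build_knowledge_store(encode, train_texts):
    """Encode each training instance into a key vector (the 'open-book' store)."""
    keys = np.stack([encode(t) for t in train_texts])
    # L2-normalize so dot products below act as cosine similarities.
    return keys / np.linalg.norm(keys, axis=1, keepdims=True)

def retrieve_cues(encode, keys, train_texts, query, k=2):
    """Return the k training instances most similar to the query."""
    q = encode(query)
    q = q / np.linalg.norm(q)
    sims = keys @ q                      # cosine similarity to every key
    top = np.argsort(-sims)[:k]          # indices of the k nearest neighbors
    return [train_texts[i] for i in top]

def toy_encode(text, dim=64):
    """Toy encoder: hashed bag of character bigrams (stand-in for LM embeddings)."""
    v = np.zeros(dim)
    for a, b in zip(text, text[1:]):
        v[(ord(a) * 31 + ord(b)) % dim] += 1.0
    return v

train = ["the movie was great", "a terrible waste of time", "great acting and plot"]
keys = build_knowledge_store(toy_encode, train)
cues = retrieve_cues(toy_encode, keys, train, "the plot was great", k=2)
# The retrieved neighbors would then be prepended to the prompt as cues.
```

In the paper's setting, retrieval happens at input, training, and inference time, so the model learns to lean on the retrieved cues rather than memorizing training instances in its parameters.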

Xiang Chen, Lei Li, Ningyu Zhang, Xiaozhuan Liang, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Sentiment Classification | MR (test) | Accuracy | 88 | 142
Sentiment Analysis | SST-2 (test) | Accuracy | 91.4 | 136
Sentiment Classification | CR (test) | Mean Accuracy | 88.8 | 58
Natural Language Inference | RTE (test) | Accuracy | 67.3 | 52
Paraphrase Detection | QQP (test) | Accuracy | 74 | 51
Natural Language Inference | MNLI (few-shot, zero-shot) | Accuracy | 71.1 | 16
Natural Language Inference | QNLI (few-shot, zero-shot) | Accuracy | 71.6 | 16
Paraphrase Identification | QQP (few-shot, zero-shot) | Accuracy | 74 | 16
Sentiment Analysis | SST-2 (few-shot, zero-shot) | Accuracy | 93.9 | 16
Sentiment Analysis | MR (few-shot, zero-shot) | Accuracy | 88 | 16

(Showing 10 of 14 rows.)

Other info

Code

https://github.com/zjunlp/PromptKG/tree/main/research/RetroPrompt
