
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

About

Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and on prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach is to reformulate a potential natural language processing task as the pre-training task of a language model, and to differentially optimize the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) plugged into any pre-trained language model; (ii) extended to a wide range of classification tasks. A comprehensive evaluation on standard NLP tasks demonstrates that the proposed approach achieves better few-shot performance. Code is available at https://github.com/zjunlp/DART.
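The key idea above — keep the language model frozen while optimizing continuous template and label embeddings by backpropagation — can be sketched in a few lines of PyTorch. This is a toy illustration, not the authors' implementation: the frozen model is replaced by a stand-in function, and all names and dimensions (`frozen_lm`, `HIDDEN`, `N_PROMPT`, the single training example) are illustrative assumptions.

```python
# Toy sketch of differentiable prompt tuning in the spirit of DART.
# Only the pseudo-template tokens and label embeddings are trainable;
# the backbone language model would stay frozen.
import torch

torch.manual_seed(0)

HIDDEN = 32     # embedding size of the (frozen) language model (assumed)
N_PROMPT = 4    # number of differentiable pseudo tokens in the template
N_LABELS = 2    # classification labels, each mapped to a trainable vector

# Trainable template and label embeddings (the only optimized parameters).
prompt_emb = torch.nn.Parameter(torch.randn(N_PROMPT, HIDDEN) * 0.02)
label_emb = torch.nn.Parameter(torch.randn(N_LABELS, HIDDEN) * 0.02)
opt = torch.optim.Adam([prompt_emb, label_emb], lr=1e-2)

def frozen_lm(embeds):
    """Stand-in for a frozen masked LM: returns a [MASK]-position vector."""
    return torch.tanh(embeds.mean(dim=0))

input_emb = torch.randn(6, HIDDEN)  # embeddings of one input sentence (toy)
gold = torch.tensor([1])            # gold label index for that sentence

for step in range(200):
    opt.zero_grad()
    # Template: [input tokens ; differentiable pseudo tokens]
    seq = torch.cat([input_emb, prompt_emb], dim=0)
    mask_vec = frozen_lm(seq)
    # Score each label by similarity to its trainable label embedding,
    # replacing a hand-crafted verbalizer with learned vectors.
    logits = (label_emb @ mask_vec).unsqueeze(0)
    loss = torch.nn.functional.cross_entropy(logits, gold)
    loss.backward()
    opt.step()

mask_vec = frozen_lm(torch.cat([input_emb, prompt_emb], dim=0))
pred = int((label_emb @ mask_vec).argmax())
```

Because both the template and the label vectors receive gradients, no discrete prompt search or manual verbalizer design is needed; after a few hundred steps the toy example is classified correctly.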

Ningyu Zhang, Luoqiu Li, Xiang Chen, Shumin Deng, Zhen Bi, Chuanqi Tan, Fei Huang, Huajun Chen • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Natural Language Inference | SNLI (test) | Accuracy | 75.8 | 681 |
| Natural Language Inference | SNLI (train) | Accuracy | 89.5 | 154 |
| Sentiment Classification | MR (test) | Accuracy | 88.2 | 142 |
| Sentiment Analysis | SST-2 (test) | Accuracy | 93.5 | 136 |
| Subjectivity Classification | Subj (test) | Accuracy | 90.7 | 125 |
| Question Classification | TREC (test) | Accuracy | 87.1 | 124 |
| Sentiment Analysis | CR | Accuracy | 93.8 | 123 |
| Text Classification | IMDB (test) | -- | -- | 79 |
| Sentiment Classification | CR (test) | Mean Accuracy | 91.8 | 58 |
| Relation Extraction | SemEval (test) | Micro F1 | 89.1 | 55 |

Showing 10 of 23 rows

Other info

Code
