Prefix-Tuning: Optimizing Continuous Prompts for Generation
About
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks. However, it modifies all the language model parameters and therefore necessitates storing a full copy for each task. In this paper, we propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen, but optimizes a small continuous task-specific vector (called the prefix). Prefix-tuning draws inspiration from prompting, allowing subsequent tokens to attend to this prefix as if it were "virtual tokens". We apply prefix-tuning to GPT-2 for table-to-text generation and to BART for summarization. We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting, outperforms fine-tuning in low-data settings, and extrapolates better to examples with topics unseen during training.
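The core idea above — freeze every language model parameter and train only a small continuous prefix that later tokens attend to as "virtual tokens" — can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's implementation (which injects prefix key/value activations into every attention layer); the class and parameter names here (`PrefixTuningWrapper`, `prefix_len`) are hypothetical, and the base model is reduced to a toy embedding table for brevity.

```python
import torch
import torch.nn as nn

class PrefixTuningWrapper(nn.Module):
    """Sketch: freeze a base model's embeddings and prepend a small
    trainable prefix of continuous 'virtual token' vectors.
    (Hypothetical name; simplified from the per-layer variant in the paper.)"""

    def __init__(self, base_embed: nn.Embedding, prefix_len: int = 10):
        super().__init__()
        self.base_embed = base_embed
        for p in self.base_embed.parameters():
            p.requires_grad = False  # the language model stays frozen
        d = base_embed.embedding_dim
        # The only trainable parameters: prefix_len x d continuous vectors.
        self.prefix = nn.Parameter(torch.randn(prefix_len, d) * 0.02)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.base_embed(input_ids)                     # (B, T, d)
        pre = self.prefix.unsqueeze(0).expand(tok.size(0), -1, -1)
        # Subsequent tokens attend to the prefix as if it were real tokens.
        return torch.cat([pre, tok], dim=1)                  # (B, P+T, d)

# Toy usage: 5 prefix vectors in front of a 3-token input.
embed = nn.Embedding(100, 16)
model = PrefixTuningWrapper(embed, prefix_len=5)
out = model(torch.tensor([[1, 2, 3]]))
print(out.shape)  # (1, 8, 16): 5 virtual tokens + 3 real tokens
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 80 = 5 * 16, a tiny fraction of the full model
```

Counting `requires_grad` parameters, as above, is how the "only 0.1% of the parameters" figure is measured: the optimizer sees just the prefix, so one frozen model plus one small prefix per task replaces a full fine-tuned copy per task.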
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Mathematical Reasoning | GSM8K (test) | Accuracy: 74.8 | 797 |
| Text-to-Image Retrieval | Flickr30K | R@1: 59 | 460 |
| Natural Language Understanding | GLUE | SST-2: 96 | 452 |
| Natural Language Understanding | GLUE (test) | SST-2 Accuracy: 52.5 | 416 |
| Multi-turn Dialogue Evaluation | MT-Bench | Overall Score: 5.688 | 331 |
| Text-to-Video Retrieval | MSR-VTT | Recall@1: 36.8 | 313 |
| Text Classification | AG-News | Accuracy: 79.08 | 248 |
| Sentiment Analysis | IMDB (test) | Accuracy: 53.3 | 248 |
| Commonsense Reasoning | Common Sense Reasoning Tasks | Avg Score: 68.4 | 241 |
| Summarization | XSum (test) | ROUGE-2: 20.93 | 231 |