Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning

About

Fine-tuning pre-trained generative language models on downstream language generation tasks has shown promising results. However, this comes at the cost of keeping a single, large model for each task, which is not ideal in low-memory or low-power scenarios (e.g., mobile). In this paper, we propose an effective way to fine-tune multiple downstream generation tasks simultaneously using a single, large pre-trained model. Experiments on five diverse language generation tasks show that, with only an additional 2-3% of parameters per task, our model can match or even improve on the performance of fine-tuning the whole model.
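To make the storage argument concrete, here is a minimal arithmetic sketch comparing the two setups the abstract contrasts: one fully fine-tuned copy of the model per task, versus one shared model plus a small per-task addition. The five tasks and the 2-3% range come from the abstract. The base model size (`BASE_PARAMS`) and the function names are hypothetical, chosen only for illustration, and the sketch says nothing about how the extra parameters are actually structured.

```python
def full_finetune_params(base_params: int, num_tasks: int) -> int:
    """Total parameters stored when every task keeps its own full model copy."""
    return base_params * num_tasks


def shared_model_params(base_params: int, num_tasks: int, extra_percent: int) -> int:
    """Total parameters stored when one base model is shared and each task
    adds `extra_percent` percent of the base size in task-specific weights."""
    per_task_extra = base_params * extra_percent // 100
    return base_params + num_tasks * per_task_extra


# Hypothetical base model size; the abstract does not state one.
BASE_PARAMS = 100_000_000
NUM_TASKS = 5  # the abstract reports five generation tasks

full = full_finetune_params(BASE_PARAMS, NUM_TASKS)
shared_low = shared_model_params(BASE_PARAMS, NUM_TASKS, 2)
shared_high = shared_model_params(BASE_PARAMS, NUM_TASKS, 3)

print(f"full fine-tuning: {full:,} parameters")
print(f"shared model, 2% per task: {shared_low:,} parameters")
print(f"shared model, 3% per task: {shared_high:,} parameters")
```

Under these assumptions the shared setup stores roughly 1.10 to 1.15 times the base model for all five tasks, compared with 5 times for per-task full fine-tuning.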

Zhaojiang Lin, Andrea Madotto, Pascale Fung • 2020

Related benchmarks

Task	Dataset	Result
Natural Language Understanding	GLUE	SST-2 96.6	551
Natural language generation	E2E (test)	ROUGE-L 89.48	100
Data-to-text generation	DART (test)	BLEU 45.7	64
Natural language generation	E2E NLG Challenge	BLEU 69.1	58
Data-to-text generation	E2E	ROUGE-L 0.713	36
Table-to-text generation	DART	METEOR 0.38	30
Natural language generation	WebNLG unseen categories	BLEU 49.8	17
Table-to-text generation	WebNLG	BLEU (Seen) 60.4	17
Natural language generation	WebNLG all categories	BLEU 56	11
Natural language generation	WebNLG seen categories	BLEU 61.1	11

Showing 10 of 14 rows
