
Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach

About

Fine-tuned pre-trained language models (LMs) have achieved enormous success in many natural language processing (NLP) tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision. Underpinned by contrastive regularization and confidence-based reweighting, this contrastive self-training framework can gradually improve model fitting while effectively suppressing error propagation. Experiments on sequence, token, and sentence pair classification tasks show that our model outperforms the strongest baseline by large margins on 7 benchmarks in 6 tasks, and achieves competitive performance with fully-supervised fine-tuning methods.
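The two ingredients the abstract names can be illustrated with a minimal numpy sketch: confidence-based reweighting down-weights low-confidence pseudo-labels so they contribute less to training, and contrastive regularization pulls samples with the same pseudo-label together in embedding space while pushing different-label pairs apart. The function names, threshold, and margin below are illustrative assumptions, not COSINE's actual implementation.

```python
import numpy as np

def confidence_weights(probs, threshold=0.8):
    """Confidence-based reweighting (sketch): give each sample a weight
    equal to its max predicted probability, zeroing out samples whose
    confidence falls below the threshold so noisy pseudo-labels are
    suppressed during self-training."""
    conf = probs.max(axis=1)                      # per-sample confidence
    return np.where(conf >= threshold, conf, 0.0)

def contrastive_reg(embeddings, pseudo_labels, margin=1.0):
    """Contrastive regularization (sketch): average a pairwise hinge loss
    that pulls same-pseudo-label pairs together (squared distance) and
    pushes different-label pairs at least `margin` apart."""
    n = len(embeddings)
    loss, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(embeddings[i] - embeddings[j])
            if pseudo_labels[i] == pseudo_labels[j]:
                loss += d ** 2                    # attract similar pairs
            else:
                loss += max(0.0, margin - d) ** 2 # repel dissimilar pairs
            count += 1
    return loss / max(count, 1)
```

In a self-training loop, the weights would scale each sample's classification loss on its pseudo-label, and the contrastive term would be added to the total loss before each gradient step.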

Yue Yu, Simiao Zuo, Haoming Jiang, Wendi Ren, Tuo Zhao, Chao Zhang • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Text Classification | AG News (test) | Accuracy | 88 | 210 |
| Question Classification | TREC | Accuracy | 82.59 | 205 |
| Text Classification | AGNews | Accuracy | 87.52 | 119 |
| Sentiment Classification | IMDB | Accuracy | 90.54 | 41 |
| Relation Extraction | TACRED v1.0 (test) | F1 Score | 41 | 37 |
| Word Sense Disambiguation | WiC (dev) | Accuracy | 89.5 | 32 |
| Word Sense Disambiguation | WiC (test) | Accuracy | 85.3 | 26 |
| Sentiment Classification | Yelp | Accuracy | 95.97 | 24 |
| Relation Classification | ChemProt | Accuracy | 54.36 | 13 |
| Slot Filling | MIT-R | Accuracy | 76.61 | 13 |

(10 of 11 rows shown)

Other info

Code
