
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

About

State-of-the-art natural language understanding classification models follow two stages: pre-training a large language model on an auxiliary task, and then fine-tuning the model on a task-specific labeled dataset using cross-entropy loss. However, the cross-entropy loss has several shortcomings that can lead to sub-optimal generalization and instability. Driven by the intuition that good generalization requires capturing the similarity between examples in one class and contrasting them with examples in other classes, we propose a supervised contrastive learning (SCL) objective for the fine-tuning stage. Combined with cross-entropy, our proposed SCL loss obtains significant improvements over a strong RoBERTa-Large baseline on multiple datasets of the GLUE benchmark in few-shot learning settings, without requiring specialized architecture, data augmentations, memory banks, or additional unsupervised data. Our proposed fine-tuning objective leads to models that are more robust to different levels of noise in the fine-tuning training data, and can generalize better to related tasks with limited labeled data.

Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov • 2020
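
Below is a minimal PyTorch sketch of the fine-tuning objective described in the abstract: cross-entropy combined with a supervised contrastive (SCL) term computed over l2-normalized encoder representations (e.g. a RoBERTa [CLS] embedding). The function names, the weighting `lam`, and the temperature `tau` defaults are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F


def scl_loss(embeddings: torch.Tensor, labels: torch.Tensor, tau: float = 0.3) -> torch.Tensor:
    """Supervised contrastive loss over a batch of pooled sentence embeddings.

    embeddings: (batch, dim) encoder outputs, e.g. RoBERTa [CLS] vectors.
    labels:     (batch,) integer class labels.
    """
    z = F.normalize(embeddings, dim=1)          # l2-normalize so dot products are cosine similarities
    sim = z @ z.t() / tau                       # (batch, batch) temperature-scaled similarity matrix
    self_mask = torch.eye(labels.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)      # exclude i == k terms from the softmax denominator
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Positives: other examples in the batch that share the anchor's label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss_per_anchor = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    # Average over anchors that have at least one positive in the batch.
    return loss_per_anchor[pos_mask.any(dim=1)].mean()


def fine_tuning_loss(logits, embeddings, labels, lam: float = 0.9, tau: float = 0.3):
    """Weighted combination of cross-entropy and SCL; lam and tau are tunable hyperparameters."""
    return (1.0 - lam) * F.cross_entropy(logits, labels) + lam * scl_loss(embeddings, labels, tau)
```

With lam = 0 the objective reduces to standard cross-entropy fine-tuning; increasing lam trades off the classification term against pulling same-class representations together and pushing different-class representations apart.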

Related benchmarks

Task | Dataset | Metric | Result | Rank
Image Classification | CIFAR-100 | Top-1 Accuracy | 81.49 | 622
Image Classification | DTD | Accuracy | 72.73 | 487
Image Classification | CIFAR-10 | -- | -- | 471
Image Classification | Aircraft | Accuracy | 87.44 | 302
Image Classification | Oxford-IIIT Pets | Accuracy | 89.71 | 259
Image Classification | Caltech-101 | Accuracy | 92.84 | 198
Image Classification | FGVC Aircraft | Top-1 Accuracy | 87.44 | 185
Emotion Recognition in Conversation | MELD | Weighted Avg F1 | 65.63 | 137
Conversational Emotion Recognition | IEMOCAP | Weighted Average F1 Score | 68.14 | 129
Image Classification | Flowers | Top-1 Accuracy | 98.65 | 80

(Showing 10 of 25 rows.)
