Self-Guided Contrastive Learning for BERT Sentence Representations
About
Although BERT and its variants have reshaped the NLP landscape, it still remains unclear how best to derive sentence embeddings from such pre-trained Transformers. In this work, we propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations. Our method fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors. Moreover, we redesign the contrastive learning objective (NT-Xent) and apply it to sentence representation learning. We demonstrate with extensive experiments that our approach is more effective than competitive baselines on diverse sentence-related tasks. We also show it is efficient at inference and robust to domain shifts.
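For readers unfamiliar with NT-Xent, the following is a minimal sketch of the standard (unmodified) objective in plain Python. It is not the redesigned objective proposed in this work, which the abstract does not spell out. The function names `cosine` and `nt_xent`, the toy 2-dimensional vectors, and the temperature of 0.1 are all illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(views_a, views_b, temperature=0.1):
    """Standard NT-Xent loss over N positive pairs (views_a[i], views_b[i]).

    Hypothetical helper for illustration only; this is the generic
    formulation, not the redesigned objective from the paper.
    """
    n = len(views_a)
    z = list(views_a) + list(views_b)  # 2N vectors in one batch
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the positive partner of z[i]
        # Similarities to every other vector in the batch (self excluded).
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp for the denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)

# Toy example: each vector in `a` has a near-duplicate at the same index in `b`.
a = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0, 0.1], [0.1, 1.0]]
loss_aligned = nt_xent(a, b)
loss_swapped = nt_xent(a, b[::-1])  # positives deliberately mismatched
```

In this toy setup the loss is lower when each vector is paired with its near-duplicate than when the pairs are swapped, which is the behavior a contrastive objective is meant to reward.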
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic Textual Similarity | STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R) various (test) | STS12 Score: 75.16 | 393 |
| Semantic Textual Similarity | STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R) | STS12 Score: 66.84 | 195 |
| Sentence Representation Evaluation | SentEval (test) | MR Accuracy: 86.03 | 28 |
| Semantic Textual Similarity | SemEval Task 1 Spanish 2017 (Track 3) | Pearson R (x100): 80.19 | 8 |
| Semantic Textual Similarity | SemEval Task 1 English Track 5 2017 | Pearson Correlation (R): 0.7824 | 8 |
| Semantic Textual Similarity | STS SemEval-2017 Task 1 (test) | Pearson Correlation: 0.5852 | 8 |
| Semantic Textual Similarity | SemEval Task 10 Spanish 2014 | STS Score (Spanish): 82.74 | 7 |