SGPT: GPT Sentence Embeddings for Semantic Search
About
Decoder transformers have continued increasing in scale, reaching hundreds of billions of parameters. Due to their scale, the same decoders set state-of-the-art results on various language tasks via prompting or fine-tuning. Yet, these large foundation models remain unusable for the related fields of semantic search and sentence embeddings. This prevents possible new state-of-the-art results and forces organizations to train and maintain separate models. To this end, we propose SGPT, which uses decoders for sentence embeddings and semantic search via prompting or fine-tuning. At 5.8 billion parameters, SGPT improves on the previously best sentence embeddings by a margin of 7% and outperforms a concurrent method with 175 billion parameters, as measured on the BEIR search benchmark. Code, models, and result files are freely available at https://github.com/Muennighoff/sgpt.
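The fine-tuned bi-encoder variant (SGPT-BE) builds sentence embeddings by position-weighted mean pooling over the decoder's final hidden states: later tokens get linearly larger weights, since in a causal decoder they have attended to more of the sentence. A minimal NumPy sketch of that pooling step, assuming hidden states have already been computed (the function name and toy shapes are illustrative, not the repository's API):

```python
import numpy as np

def position_weighted_mean_pool(hidden_states, attention_mask):
    """Position-weighted mean pooling over decoder hidden states.

    hidden_states: (seq_len, dim) final-layer states for one sentence
    attention_mask: (seq_len,) with 1 for real tokens, 0 for padding
    Returns a (dim,) sentence embedding.
    """
    # Weight token i proportionally to its position (1-indexed),
    # so later tokens contribute more to the pooled embedding.
    positions = np.arange(1, hidden_states.shape[0] + 1, dtype=np.float64)
    weights = positions * attention_mask      # zero out padding tokens
    weights = weights / weights.sum()         # normalize weights to sum to 1
    return weights @ hidden_states            # weighted average over tokens
```

In practice the hidden states would come from a GPT-style model's last layer; the same pooling is applied to queries and documents before scoring them by cosine similarity.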
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Textual Similarity | STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R), various (test) | STS12 score | 72.27 | 393 |
| Information Retrieval | BEIR (test) | TREC-COVID score | 0.807 | 76 |
| Reranking | MS MARCO (dev) | -- | -- | 71 |
| Semantic Textual Similarity | STS-B | Spearman's rho (×100) | 84.7 | 70 |
| Information Retrieval | BEIR | TREC-COVID | 0.873 | 59 |
| Information Retrieval | BEIR v1.0.0 (test) | ArguAna | 51.4 | 55 |
| Text Embedding | MTEB | MTEB score | 58.93 | 45 |
| Semantic Textual Similarity (STS) | MTEB English 2023 (test) | BIO | 79.5 | 19 |
| Sentence-level retrieval | ReQA SQuAD (test) | MRR | 0.783 | 13 |
| Sentence-level retrieval | ReQA NQ (test) | MRR | 65.2 | 13 |