RepBERT: Contextualized Text Embeddings for First-Stage Retrieval
About
Although exact term matching between queries and documents is the dominant method for first-stage retrieval, we propose a different approach, called RepBERT, which represents documents and queries with fixed-length contextualized embeddings. The inner products of query and document embeddings are used as relevance scores. On the MS MARCO Passage Ranking task, RepBERT achieves state-of-the-art results among all initial retrieval techniques, and its efficiency is comparable to that of bag-of-words methods.
Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma • 2020
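The scoring rule in the abstract (relevance as the inner product of a query embedding and a document embedding) can be illustrated with a minimal sketch. The toy vectors below are hand-written stand-ins for encoder output, and the names `inner_product` and `rank_documents` are hypothetical; this is not the authors' implementation, only an illustration of inner-product ranking over fixed-length vectors.

```python
from typing import Dict, List, Sequence, Tuple


def inner_product(query_vec: Sequence[float], doc_vec: Sequence[float]) -> float:
    """Relevance score: inner product of two fixed-length embeddings."""
    if len(query_vec) != len(doc_vec):
        raise ValueError("query and document embeddings must have the same length")
    return sum(q * d for q, d in zip(query_vec, doc_vec))


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score every document against the query and return the top_k by score."""
    scored = [(doc_id, inner_product(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional embeddings (assumed values, not produced by any real encoder).
doc_vecs = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.2, 0.8, 0.1],
    "d3": [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

ranking = rank_documents(query_vec, doc_vecs, top_k=2)
for doc_id, score in ranking:
    print(doc_id, round(score, 4))
```

Because document embeddings do not depend on the query, they can be computed once offline; at query time only the query needs to be encoded before scoring. The exhaustive loop above is for clarity only, and a real system would typically use a vectorized or indexed inner-product search instead.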
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Information Retrieval | ClueWeb 500K | nDCG@5 | 26.24 | 21 |
| Information Retrieval | Gov 500K | nDCG@5 | 0.3101 | 21 |
| Document Retrieval | NQ (test) | Hits@1 | 50.2 | 18 |
| Passage retrieval | MS MARCO Passage Ranking 7k queries (dev) | MRR@10 | 30.4 | 11 |