COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List

About

Classical information retrieval systems such as BM25 rely on exact lexical match and carry out search efficiently with an inverted list index. Recent neural IR models shift towards soft semantic matching of all query and document terms, but they lose the computational efficiency of exact match systems. This paper presents COIL, a contextualized exact match retrieval architecture that brings semantics to lexical matching. COIL scoring is based on the contextualized representations of overlapping query and document tokens. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show that COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers at similar or lower latency.

Luyu Gao, Zhuyun Dai, Jamie Callan • 2021
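
The scoring idea in the abstract can be illustrated with a minimal sketch: an inverted list keyed by token stores one vector per token occurrence, and a query only touches the lists of tokens it shares with documents. The abstract does not spell out the similarity function or how multiple occurrences are combined, so this sketch assumes dot-product similarity, max pooling over a document's matching occurrences, and a sum over query tokens. The function names (`build_index`, `coil_scores`) and the two-dimensional toy vectors are hypothetical; in practice the vectors would come from a deep language model encoder.

```python
from collections import defaultdict


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_index(docs):
    """Build a contextualized inverted list.

    docs maps doc_id -> list of (token, vector) pairs, one per occurrence.
    Returns token -> list of (doc_id, vector) postings.
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token, vec in tokens:
            index[token].append((doc_id, vec))
    return index


def coil_scores(query, index):
    """Score documents using only tokens that exactly match query tokens.

    Assumption: for each query token, take the best (max) dot product among
    a document's occurrences of that token, then sum over query tokens.
    """
    scores = defaultdict(float)
    for q_tok, q_vec in query:
        best = {}  # doc_id -> best similarity for this query token
        for doc_id, d_vec in index.get(q_tok, []):
            sim = dot(q_vec, d_vec)
            if doc_id not in best or sim > best[doc_id]:
                best[doc_id] = sim
        for doc_id, sim in best.items():
            scores[doc_id] += sim
    return dict(scores)


# Toy data: hand-made vectors standing in for encoder outputs.
docs = {
    "d1": [("bank", [1.0, 0.0]), ("river", [0.0, 1.0]), ("bank", [0.5, 0.5])],
    "d2": [("bank", [0.0, 1.0]), ("loan", [1.0, 0.0])],
}
index = build_index(docs)
query = [("bank", [0.0, 1.0]), ("loan", [1.0, 0.0])]
scores = coil_scores(query, index)
print(scores)
```

Both documents contain "bank", but the stored vectors differ by context, so the same surface token contributes a different amount to each document's score. Tokens absent from the query, such as "river", are never visited, which is where the exact-match efficiency comes from.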

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Passage retrieval | MsMARCO (dev) | MRR@10: 35.5 | 116 |
| Retrieval | MS MARCO (dev) | MRR@10: 0.355 | 84 |
| Retrieval | TREC DL 2019 | NDCG@10: 66 | 71 |
| Passage Ranking | TREC DL 2019 (test) | NDCG@10: 70.4 | 33 |
| Document Retrieval | MS MARCO Document (dev) | MRR@100: 0.397 | 24 |
| Document Retrieval | MS MARCO document retrieval (DL'19) | NDCG@10: 63.6 | 10 |
| Passage Reranking | MS-MARCO passage ranking August 11, 2021 (test) | MRR@10: 42.7 | 3 |