
From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective

About

Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of training examples -- while still relying on the same backbone architecture. In the meantime, sparse representation learning fueled by traditional inverted indexing techniques has seen growing interest, inheriting desirable IR priors such as explicit lexical matching. While some architectural variants have been proposed, less effort has been put into the training of such models. In this work, we build on SPLADE -- a sparse expansion-based retriever -- and show to what extent it can benefit from the same training improvements as dense models, by studying the effect of distillation, hard-negative mining, and Pre-trained Language Model initialization. We further study the link between effectiveness and efficiency, in both in-domain and zero-shot settings, leading to state-of-the-art results in both scenarios for sufficiently expressive models.
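To make the abstract's ingredients concrete, the sketch below (PyTorch, with our own naming; not the authors' implementation) illustrates SPLADE's expansion-based representation, where vocabulary-level term weights come from max-pooling a log-saturated ReLU over the MLM-head logits, together with a MarginMSE-style distillation loss of the kind used to train such retrievers from a cross-encoder teacher. Hard-negative mining then amounts to drawing the negatives in this loss from the top ranks of a first-stage retriever rather than sampling them at random.

```python
import torch
import torch.nn.functional as F

def splade_pooling(mlm_logits, attention_mask):
    # mlm_logits: (batch, seq_len, vocab) scores from a BERT-style MLM head.
    # SPLADE weight for vocabulary entry j: max over the sequence of
    # log(1 + ReLU(logit)), a log-saturated importance, with padding masked out.
    saturated = torch.log1p(torch.relu(mlm_logits))
    saturated = saturated * attention_mask.unsqueeze(-1)
    weights, _ = saturated.max(dim=1)  # (batch, vocab): sparse expansion vector
    return weights

def score(q_rep, d_rep):
    # Relevance is a dot product in vocabulary space, which stays compatible
    # with a traditional inverted index at retrieval time.
    return (q_rep * d_rep).sum(dim=-1)

def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    # MarginMSE-style distillation: the student retriever is trained to match
    # the positive-vs-negative score margin of a cross-encoder teacher.
    # With hard-negative mining, student_neg/teacher_neg are scores of
    # negatives mined from a first-stage ranker, not random passages.
    return F.mse_loss(student_pos - student_neg, teacher_pos - teacher_neg)
```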

Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Document Ranking | TREC DL Track 2019 (test) | nDCG@10 | 73.2 | 96
Retrieval | MS MARCO (dev) | MRR@10 | 0.389 | 84
Information Retrieval | BEIR (test) | -- | -- | 76
Retrieval | TREC DL 2019 | nDCG@10 | 73 | 71
Reranking | MS MARCO (dev) | MRR@10 | 0.38 | 71
Information Retrieval | BEIR | TREC-COVID | 0.711 | 59
Information Retrieval | MS MARCO | -- | -- | 56
Information Retrieval | MS MARCO DL2019 | nDCG@10 | 74.3 | 26
Retrieval | Bridge (test) | Hit@10 | 80 | 25
Web Search Retrieval | TREC DL 20 | nDCG@10 | 72.8 | 22
(Showing 10 of 21 benchmark rows.)
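As a quick reference for the metrics reported above: MRR@10 averages, over queries, the reciprocal rank of the first relevant document within the top 10 results, while nDCG@10 sums graded relevance gains discounted logarithmically by rank and normalizes by the ideal ordering. A minimal, slightly simplified sketch (helper names are our own, not from any benchmark toolkit; the nDCG here normalizes within the retrieved list only):

```python
import math

def rr_at_10(ranked_ids, relevant_ids):
    # Reciprocal rank of the first relevant document in the top 10 (0 if none);
    # MRR@10 is this value averaged over all queries.
    for rank, doc_id in enumerate(ranked_ids[:10], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def ndcg_at_10(gains):
    # gains: graded relevance of the returned documents, in ranked order.
    # DCG@10 discounts each gain by log2(rank + 1); nDCG divides by the DCG
    # of the ideal (descending-gain) reordering of the same list.
    dcg = sum(g / math.log2(r + 1) for r, g in enumerate(gains[:10], start=1))
    ideal = sorted(gains, reverse=True)
    idcg = sum(g / math.log2(r + 1) for r, g in enumerate(ideal[:10], start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Example: first relevant hit at rank 2 gives a reciprocal rank of 0.5.
assert rr_at_10(["d3", "d7", "d1"], {"d7"}) == 0.5
```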
