An Analysis of Fusion Functions for Hybrid Retrieval

About

We study hybrid search in text retrieval where lexical and semantic search are fused together with the intuition that the two are complementary in how they model relevance. In particular, we examine fusion by a convex combination (CC) of lexical and semantic scores, as well as the Reciprocal Rank Fusion (RRF) method, and identify their advantages and potential pitfalls. Contrary to existing studies, we find RRF to be sensitive to its parameters; that the learning of a CC fusion is generally agnostic to the choice of score normalization; that CC outperforms RRF in in-domain and out-of-domain settings; and finally, that CC is sample efficient, requiring only a small set of training examples to tune its only parameter to a target domain.
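As a rough illustration of the two fusion functions compared above, the sketch below fuses one query's lexical and semantic scores with a convex combination and with RRF. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the choice of min-max normalization, the convention that `alpha` weights the semantic score, the handling of documents missing from one list, and the default `k = 60` for RRF are all assumptions made for illustration.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {d: (s - lo) / span if span > 0 else 0.0 for d, s in scores.items()}


def cc_fuse(lexical: Dict[str, float], semantic: Dict[str, float],
            alpha: float = 0.5) -> Dict[str, float]:
    """Convex combination: alpha * semantic + (1 - alpha) * lexical.

    Assumption: a document missing from one list gets that list's
    normalized minimum (0.0).
    """
    lex, sem = min_max(lexical), min_max(semantic)
    docs = set(lex) | set(sem)
    return {d: alpha * sem.get(d, 0.0) + (1.0 - alpha) * lex.get(d, 0.0)
            for d in docs}


def rrf_fuse(lexical: Dict[str, float], semantic: Dict[str, float],
             k: float = 60.0) -> Dict[str, float]:
    """Reciprocal Rank Fusion: sum of 1 / (k + rank) over both rankings.

    Assumption: a document absent from a list contributes nothing for it.
    """
    fused: Dict[str, float] = {}
    for scores in (lexical, semantic):
        ranked = sorted(scores, key=scores.get, reverse=True)
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return fused


def top_k(fused: Dict[str, float], k: int = 10) -> List[str]:
    """Return the k highest-scoring document ids, best first."""
    return sorted(fused, key=fused.get, reverse=True)[:k]
```

The convex combination uses the score values themselves and has a single parameter, `alpha`, while RRF discards the scores and keeps only the ranks, with `k` controlling how quickly a document's contribution decays with rank.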

Sebastian Bruch, Siyu Gai, Amir Ingber • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Visual document retrieval | ViDoRe V2 | -- | 36 |
| Document Retrieval | ViDoRe V1 | Arxiv Score: 88 | 23 |
| Hybrid Retrieval | ViDoRe 1 | Avg Performance Score: 93.5 | 18 |
| Hybrid Retrieval | Vidore 2 | Average NDCG@5: 62.1 | 9 |
| Retrieval | Vidore 2 | Recall@5 (Avg): 58.5 | 9 |
| Hybrid Retrieval | ViDoRe 3 | Avg NDCG@5 Gain (%): 2.66 | 5 |