The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models

About

We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains. At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture. "Expando" refers to the use of document expansion techniques to enrich keyword representations of texts prior to inverted indexing. "Mono" and "Duo" refer to components in a reranking pipeline based on a pointwise model and a pairwise model that rerank initial candidates retrieved using keyword search. We present experimental results from the MS MARCO passage and document ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge that validate our design. In all these tasks, we achieve effectiveness that is at or near the state of the art, in some cases using a zero-shot approach that does not exploit any training data from the target task. To support replicability, implementations of our design pattern are open-sourced in the Pyserini IR toolkit and PyGaggle neural reranking library.
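The three stages described above can be sketched as a small pipeline. This is only an illustrative sketch, not the authors' implementation: every function name here is hypothetical, the keyword retriever is a plain term-overlap counter standing in for inverted-index search, and `toy_mono` and `toy_duo` are stand-ins for the pretrained sequence-to-sequence rerankers. Summing pairwise scores in the duo stage is likewise an assumption made for simplicity.

```python
from itertools import permutations
from typing import Callable, List


def expand(doc: str, predicted_queries: List[str]) -> str:
    """Expando: append predicted queries to a document before indexing."""
    return " ".join([doc, *predicted_queries])


def keyword_retrieve(query: str, docs: List[str], k: int) -> List[int]:
    """Stand-in for keyword search: rank documents by shared-term count."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(d.lower().split())), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


def mono_rerank(query: str, docs: List[str], candidates: List[int],
                mono_score: Callable[[str, str], float], k: int) -> List[int]:
    """Mono: score each (query, document) pair independently (pointwise)."""
    ranked = sorted(candidates, key=lambda i: -mono_score(query, docs[i]))
    return ranked[:k]


def duo_rerank(query: str, docs: List[str], candidates: List[int],
               duo_score: Callable[[str, str, str], float]) -> List[int]:
    """Duo: compare ordered document pairs and sum each document's wins (pairwise)."""
    totals = {i: 0.0 for i in candidates}
    for i, j in permutations(candidates, 2):
        totals[i] += duo_score(query, docs[i], docs[j])
    return sorted(candidates, key=lambda i: -totals[i])


def toy_mono(query: str, doc: str) -> float:
    """Hypothetical pointwise scorer: fraction of query terms found in the document."""
    q_terms = set(query.lower().split())
    return len(q_terms & set(doc.lower().split())) / len(q_terms)


def toy_duo(query: str, doc_a: str, doc_b: str) -> float:
    """Hypothetical pairwise scorer: preference for doc_a over doc_b, in [0, 1]."""
    a, b = toy_mono(query, doc_a), toy_mono(query, doc_b)
    return 1.0 if a > b else 0.5 if a == b else 0.0


def rank(query: str, docs: List[str], k0: int = 3, k1: int = 2) -> List[int]:
    """Keyword retrieval, then mono reranking of k0 hits, then duo reranking of k1 hits."""
    candidates = keyword_retrieve(query, docs, k0)
    candidates = mono_rerank(query, docs, candidates, toy_mono, k1)
    return duo_rerank(query, docs, candidates, toy_duo)


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "how to train a cat to sit",
    "stock market news today",
]
print(rank("train cat sit", docs))
```

The pairwise stage runs on a smaller candidate set than the pointwise stage because it scores every ordered pair, so its cost grows quadratically with the number of candidates.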

Ronak Pradeep, Rodrigo Nogueira, Jimmy Lin • 2021

Related benchmarks

Task	Dataset	Result
Document Ranking	TREC DL Track 2019 (test)	nDCG@10 70.3	133
Document Ranking	TREC DL Track 2020 (test)	nDCG@10 0.666	63
Ranking	BEIR selected subset v1.0.0 (test)	TREC-COVID 80.1	38
Information Retrieval	SciFact BEIR (test)	nDCG@10 57.9	36
Information Retrieval	BEIR COVID v1 (test)	nDCG@10 63.1	26
Multi-hop Question Answering	MuSiQue	Recall@1 34.3	22
Multi-hop Question Answering	HotpotQA	Recall@1 43.2	22
Multimodal Reranking	INQUIRE RERANK	nDCG@50 68.7	11
