RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses
About
Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT. However, there are limited studies on how to leverage more powerful sequence-to-sequence models such as T5. Existing attempts usually formulate text ranking as classification and rely on postprocessing to obtain a ranked list. In this paper, we propose RankT5 and study two T5-based ranking model structures, an encoder-decoder model and an encoder-only model, so that they can not only directly output ranking scores for each query-document pair, but also be fine-tuned with "pairwise" or "listwise" ranking losses to optimize ranking performance. Our experiments show that the proposed models with ranking losses achieve substantial ranking performance gains on different public text ranking datasets. Moreover, when fine-tuned with listwise ranking losses, the ranking model appears to have better zero-shot ranking performance on out-of-domain datasets compared to the model fine-tuned with classification losses.
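To make the listwise idea concrete, the sketch below shows a softmax cross-entropy listwise loss computed over the scores one model produces for all candidate documents of a single query. This is a minimal illustration of a listwise ranking loss in general, not the paper's exact implementation; the function name and the toy scores are hypothetical.

```python
import math

def listwise_softmax_loss(scores, labels):
    """Softmax cross-entropy listwise loss for one query.

    scores: model-predicted ranking scores, one per candidate document.
    labels: relevance labels (e.g. 1 = relevant, 0 = not relevant).

    The loss is low when relevant documents receive high scores relative
    to the other candidates in the same list.
    """
    # Numerically stable log-softmax over the document scores.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    log_probs = [s - log_z for s in scores]

    total_relevance = sum(labels)
    if total_relevance == 0:
        return 0.0  # no relevant documents: nothing to optimize for this query
    return -sum(y * lp for y, lp in zip(labels, log_probs)) / total_relevance

# Toy example: the loss drops as the relevant document's score rises
# above its competitors in the list.
good = listwise_softmax_loss([5.0, 1.0, 1.0], [1, 0, 0])
bad = listwise_softmax_loss([1.0, 5.0, 1.0], [1, 0, 0])
```

Because the normalization runs over the whole candidate list, the gradient for each document depends on every other document's score, which is what distinguishes a listwise loss from the pointwise classification setup the paper compares against.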
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Information Retrieval | Scientific QA Base setting | Hit Rate @ 1 | 46.9 | 38 |
| Question Answering | Scientific QA Base setting | F1 Score | 40.61 | 38 |
| Reranking | BEIR | NQ NDCG@5 | 0.5097 | 35 |
| Reranking | TREC | NDCG@5 (DL19) | 71.66 | 35 |
| Reranking | SciRAG-SSLI easy 1.0 (test) | Hit Rate @ 1 | 55.4 | 19 |
| Scientific Question Answering | SciRAG-SSLI hard 1.0 (test) | F1 Score | 45.48 | 19 |
| Reranking | SciRAG-SSLI hard 1.0 (test) | Hit Rate @ 1 | 52.6 | 19 |
| Scientific Question Answering | SciRAG-SSLI easy 1.0 (test) | F1 Score | 44.81 | 19 |
| Text Ranking | BEIR out-of-domain | Arguana Score | 33 | 9 |
| Information Retrieval | BEIR BM25 Top-100 initial retrieval | TREC-COVID Score | 81.7 | 7 |