ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

About

Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models (LRMs), many studies have demonstrated that step-by-step reasoning during test-time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many complex ranking scenarios, and the ranking ability of reasoning-intensive rerankers remains largely underdeveloped. In this paper, we first propose an automated reasoning-intensive training data synthesis framework, which sources training queries and passages from diverse domains and applies DeepSeek-R1 to generate high-quality training labels. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage training approach, which includes a cold-start supervised fine-tuning (SFT) stage and a reinforcement learning (RL) stage. During the RL stage, we design a novel multi-view ranking reward tailored to the multi-turn nature of listwise ranking. Extensive experiments demonstrate that our trained reasoning-intensive reranker \textbf{ReasonRank} outperforms existing baselines significantly and also achieves much lower latency than the pointwise reranker. Our codes are available at https://github.com/8421BCD/ReasonRank.

Wenhan Liu, Xinyu Ma, Weiwei Sun, Yutao Zhu, Yuchen Li, Dawei Yin, Zhicheng Dou• 2025

Related benchmarks

Task	Dataset	Result
Information Retrieval	BEIR	SciFact0.755	120
Information Retrieval	BRIGHT	--	94
Passage Reranking	BRIGHT	NDCG@10 (Avg)38.03	54
Information Retrieval	BRIGHT 1.0 (test)	nDCG@10 (Avg)26.5	35
Recommendation	HRT	ND@50.8046	26
Information Retrieval	BRIGHT v1 (test)	nDCG@10 (Avg)40.8	22
Long-context Memory Retrieval and Reasoning	LongMemEval 128K	F1 Score44.26	20
Long-context Memory Retrieval and Reasoning	LoCoMo 32K	F1 Score41.04	20
Long-context Memory Retrieval and Reasoning	LongMemEval 1M	F1 Score48.2	20
Long-context Memory Retrieval and Reasoning	WebWalker 128K	F1 Score25.72	20

Showing 10 of 42 rows

Other info

Follow for update

@wizwand_team Discord