Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning

About

We propose Rec-R1, a general reinforcement learning framework that bridges large language models (LLMs) with recommendation systems through closed-loop optimization. Unlike prompting and supervised fine-tuning (SFT), Rec-R1 directly optimizes LLM generation using feedback from a fixed black-box recommendation model, without relying on synthetic SFT data from proprietary models such as GPT-4o. This avoids the substantial cost and effort required for data distillation. To verify the effectiveness of Rec-R1, we evaluate it on two representative tasks: product search and sequential recommendation. Experimental results demonstrate that Rec-R1 not only consistently outperforms prompting- and SFT-based methods, but also achieves significant gains over strong discriminative baselines, even when used with simple retrievers such as BM25. Moreover, Rec-R1 preserves the general-purpose capabilities of the LLM, unlike SFT, which often impairs instruction-following and reasoning. These findings suggest Rec-R1 as a promising foundation for continual task-specific adaptation without catastrophic forgetting.
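The closed loop described above, in which the LLM's output is scored by a fixed black-box recommender and that score drives the policy update, can be sketched as a reward function. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the reward, so using binary-relevance NDCG@k here is an assumption, and `toy_retriever`, `reward`, and the three-item catalog are hypothetical names and data standing in for a real retriever such as BM25. The reinforcement learning update itself is omitted.

```python
import math
from typing import Callable, Sequence


def ndcg_at_k(ranked_ids: Sequence[str], relevant_ids: set[str], k: int) -> float:
    """Binary-relevance NDCG@k for a single ranked list."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, item in enumerate(ranked_ids[:k])
        if item in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def toy_retriever(query: str, catalog: dict[str, str]) -> list[str]:
    """Hypothetical stand-in for a black-box retriever: rank by word overlap."""
    query_words = set(query.lower().split())
    scores = {
        item_id: len(query_words & set(text.lower().split()))
        for item_id, text in catalog.items()
    }
    # Higher overlap first; ties broken by item id for determinism.
    return sorted(catalog, key=lambda item_id: (-scores[item_id], item_id))


def reward(
    generated_query: str,
    retriever: Callable[[str], list[str]],
    relevant_ids: set[str],
    k: int = 5,
) -> float:
    """Scalar feedback for one LLM generation (assumed here to be NDCG@k)."""
    return ndcg_at_k(retriever(generated_query), relevant_ids, k)


# Hypothetical catalog and ground truth, for illustration only.
catalog = {
    "p1": "wired earbuds with microphone",
    "p2": "portable bluetooth speaker",
    "p3": "wireless noise cancelling headphones",
}
relevant = {"p3"}


def retriever(query: str) -> list[str]:
    return toy_retriever(query, catalog)


raw_reward = reward("something to block sound on flights", retriever, relevant, k=2)
rewritten_reward = reward("noise cancelling headphones", retriever, relevant, k=2)
```

In this toy setup the original user request shares no words with the relevant item and earns a lower reward than the rewritten query. That difference is the kind of signal a policy-gradient update could use to favor generations the retriever can act on, while the retriever itself stays fixed.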

Jiacheng Lin, Tian Wang, Kun Qian • 2025

Related benchmarks

Task	Dataset	Result
Recommendation	MovieLens 20M	Accuracy 59.4	19
Recommendation	LFM-1K	Accuracy 87.2	19
Recommendation	MovieLens 1M	Accuracy 40.2	19
Recommendation	ExpBench-Rec Movie (test)	NDCG@10 23.98	11
Recommendation	ExpBench Rec-Music (test)	NDCG@10 20.5	11
Maximizing Interest	KuaiRec dense	N@5 57.2	9
Ranking	KuaiRec Explore New Topics (test)	N@5 73	8
Ranking	MovieLens 1M	NDCG@5 0.554	8
Ranking	MovieLens 1M Trend Promotion (test)	Hit Rate@5 60.7	8
Ranking	KuaiRec	NDCG@5 39.1	8

Showing 10 of 13 rows
