
LLM-I2I: Boost Your Small Item2Item Recommendation Model with Large Language Model

About

Item-to-Item (I2I) recommendation models are widely used in real-world systems due to their scalability, real-time capabilities, and high recommendation quality. Research to enhance I2I performance focuses on two directions: 1) model-centric approaches, which adopt deeper architectures but risk increased computational costs and deployment complexity, and 2) data-centric methods, which refine training data without altering models, offering cost-effectiveness but struggling with data sparsity and noise. To address these challenges, we propose LLM-I2I, a data-centric framework leveraging Large Language Models (LLMs) to mitigate data quality issues. LLM-I2I includes (1) an LLM-based generator that synthesizes user-item interactions for long-tail items, alleviating data sparsity, and (2) an LLM-based discriminator that filters noisy interactions from real and synthetic data. The refined data is then fused to train I2I models. Evaluated on industry (AEDS) and academic (ARD) datasets, LLM-I2I consistently improves recommendation accuracy, particularly for long-tail items. Deployed on a large-scale cross-border e-commerce platform, it boosts recall number (RN) by 6.02% and gross merchandise value (GMV) by 1.22% over existing I2I models. This work highlights the potential of LLMs in enhancing data-centric recommendation systems without modifying model architectures.
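The two LLM components described above can be sketched as a simple data-refinement pipeline. This is an illustrative sketch only: the function names (`generate_interactions`, `is_noisy`, `refine_training_data`) and the trivial stand-in logic are assumptions, not the paper's implementation; in LLM-I2I both steps would be backed by LLM prompts.

```python
# Hypothetical sketch of the LLM-I2I data-centric pipeline: an LLM-based
# generator synthesizes interactions for long-tail items, an LLM-based
# discriminator filters noise, and the fused data trains the I2I model.
from collections import Counter


def generate_interactions(item, n=2):
    # Stand-in for the LLM-based generator: in the paper, an LLM would
    # synthesize plausible user-item interactions for a long-tail item.
    return [(f"synthetic_user_{i}", item) for i in range(n)]


def is_noisy(interaction):
    # Stand-in for the LLM-based discriminator: in the paper, an LLM judges
    # whether a real or synthetic interaction is noise. Here: a trivial rule.
    user, item = interaction
    return item is None


def refine_training_data(real_interactions, long_tail_threshold=1):
    # 1) Identify long-tail items (few real interactions) and synthesize
    #    extra interactions for them to alleviate data sparsity.
    counts = Counter(item for _, item in real_interactions)
    long_tail = [it for it, c in counts.items() if c <= long_tail_threshold]
    synthetic = [iv for it in long_tail for iv in generate_interactions(it)]
    # 2) Filter noisy interactions from the fused real + synthetic data.
    fused = [iv for iv in real_interactions + synthetic if not is_noisy(iv)]
    # The refined data then trains the I2I model, whose architecture is
    # left unchanged.
    return fused


data = [("u1", "phone"), ("u2", "phone"), ("u3", "rare_case")]
refined = refine_training_data(data)
```

Under this toy threshold, only `rare_case` counts as long-tail, so two synthetic interactions are appended to the three real ones before filtering.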

Yinfu Feng, Yanjing Wu, Rong Xiao, Xiaoyi Zen • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Recommendation | Beauty | Recall@10 | 5.72 | 39 |
| Sequential Recommendation | Amazon Sports and Outdoors | Recall@5 | 2.8 | 21 |
| Item Recommendation | ARD Toys And Games | Recall@5 | 0.0522 | 12 |
| Recommendation | AEDS | Recall@5 | 5.81 | 8 |
| Product Recommendation | AliExpress Production Search System (online) | RN | 6.02 | 1 |
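The Recall@K metric in the table above measures what fraction of a user's relevant items appears in the top-K recommendations. A minimal sketch (the function name and example data are illustrative, not from the benchmark):

```python
def recall_at_k(recommended, relevant, k):
    # Fraction of the relevant items that appear in the top-k recommendations.
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant)


# 1 of 2 relevant items retrieved in the top 5 -> Recall@5 = 0.5
score = recall_at_k(["a", "b", "c", "d", "e"], ["b", "z"], 5)
```

Benchmark tables typically report this value averaged over all test users, often scaled to a percentage.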
