
Catalog-Native LLM: Speaking Item-ID Dialect with Less Entanglement for Recommendation

About

While collaborative filtering delivers predictive accuracy and efficiency, and Large Language Models (LLMs) enable expressive and generalizable reasoning, modern recommendation systems must bring these strengths together. Growing user expectations, such as natural-language queries and transparent explanations, further underscore the need for a unified approach. Doing so is nontrivial, however: collaborative signals are token-efficient but semantically opaque, while LLMs are semantically rich but struggle to model implicit user preferences when trained only on textual inputs. This paper introduces the Item-ID + Oral-language Mixture-of-Experts Language Model (IDIOMoE), which treats item interaction histories as a native dialect within the language space, so that collaborative signals are understood in the same way as natural language. By splitting the feed-forward network in each block of a pretrained LLM into a separate text expert and an item expert with token-type gating, the method avoids destructive interference between the text and catalog modalities. IDIOMoE achieves strong recommendation performance on both public and proprietary datasets while preserving the text understanding of the pretrained model.
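The token-type gating described in the abstract can be sketched as a hard router: each token carries a type flag (text or item-ID), and its hidden state is sent through the corresponding expert FFN. The sketch below is illustrative only, assuming a two-layer ReLU FFN and NumPy arrays in place of the paper's actual pretrained-LLM implementation; all names and shapes are hypothetical.

```python
import numpy as np

def ffn(x, w1, w2):
    # A simple two-layer feed-forward expert with ReLU activation.
    return np.maximum(x @ w1, 0.0) @ w2

def token_type_gated_ffn(hidden, token_types, text_params, item_params):
    """Route each token's hidden state to the text or item expert
    based on a hard token-type flag (0 = text token, 1 = item-ID token)."""
    out = np.empty_like(hidden)
    text_mask = token_types == 0
    item_mask = ~text_mask
    out[text_mask] = ffn(hidden[text_mask], *text_params)
    out[item_mask] = ffn(hidden[item_mask], *item_params)
    return out

# Toy example: 5 tokens with hidden size 8, FFN inner size 16.
rng = np.random.default_rng(0)
d, d_ff = 8, 16
text_params = (rng.standard_normal((d, d_ff)), rng.standard_normal((d_ff, d)))
item_params = (rng.standard_normal((d, d_ff)), rng.standard_normal((d_ff, d)))
hidden = rng.standard_normal((5, d))
token_types = np.array([0, 0, 1, 1, 0])  # a mixed text / item-ID sequence
out = token_type_gated_ffn(hidden, token_types, text_params, item_params)
print(out.shape)  # (5, 8)
```

Because the gate is determined by token type rather than learned per token, text tokens never pass through the item expert (and vice versa), which is what lets the text expert retain the pretrained model's language behavior.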

Reza Shirkavand, Xiaokai Wei, Chen Wang, Zheng Hui, Heng Huang, Michelle Gong• 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sequential Recommendation | Amazon Beauty | NDCG@10 | 6.65 | 84 |
| Sequential Recommendation | Amazon Toys | R@10 | 0.0927 | 51 |
| Sequential Recommendation | Amazon Sports | HR@10 | 6.74 | 22 |
| Sequential Recommendation | Amazon Instruments | NDCG@10 | 0.1054 | 16 |
| Sequential Recommendation | Amazon Games | NDCG@10 | 6.05 | 16 |
| Sequential Recommendation | Amazon Books | NDCG@10 | 2.24 | 13 |
| Sequential Recommendation | Industrial Dataset | NDCG@10 | 27.1 | 6 |
