Token-Efficient Item Representation via Images for LLM Recommender Systems
About
Large Language Models (LLMs) have recently emerged as a powerful backbone for recommender systems. Existing LLM-based recommender systems represent items in natural language in one of two ways: Attribute-based Representation or Description-based Representation. In this work, we address the trade-off between efficiency and effectiveness that these two approaches face when representing the items consumed by users. Based on our observation that the images and descriptions associated with items overlap significantly in the information they carry, we propose a novel method, Item representation for LLM-based Recommender system (I-LLMRec). Our main idea is to use images as an alternative to lengthy textual descriptions when representing items, reducing token usage while preserving the rich semantic information of item descriptions. Through extensive experiments, we demonstrate that I-LLMRec outperforms existing methods in both efficiency and effectiveness. A further appeal of I-LLMRec is that it reduces sensitivity to noise in descriptions, leading to more robust recommendations.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sequential Recommendation | Amazon Arts (test) | NDCG@10 | 51.91 | 50 |
| Sequential Recommendation | Amazon Sports (test) | NDCG@10 | 0.4071 | 21 |
| Sequential Recommendation | Amazon Grocery (test) | NDCG@5 | 39.56 | 10 |
| Sequential Recommendation | Amazon Phone (test) | NDCG@5 | 39 | 10 |
| Recommendation | Goodbooks (test) | Hit@5 | 48.7 | 4 |
| Micro-video recommendation | MicroLens | Hit Rate@5 | 48.54 | 3 |
| Sequential Recommendation | Amazon Automotive (test) | Hit Rate@5 | 45.15 | 3 |
| Sequential Recommendation | Amazon Video (test) | Hit@5 | 64.13 | 3 |
| Sequential Recommendation | H&M Fashion (test) | Hit Rate@5 | 53.21 | 3 |
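
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of Hit@k and NDCG@k as they are commonly defined for next-item recommendation, where each test case has exactly one ground-truth item. It is not taken from the I-LLMRec paper; the function names and the single-target assumption are illustrative, and the evaluation protocol behind each benchmark entry (candidate set size, score scaling) may differ.

```python
import math
from typing import Sequence


def hit_at_k(ranked_items: Sequence[str], target: str, k: int) -> float:
    """Return 1.0 if the ground-truth item is among the top-k ranked items."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items: Sequence[str], target: str, k: int) -> float:
    """NDCG@k for a single relevant item.

    With one relevant item the ideal DCG is 1, so NDCG reduces to
    1 / log2(rank + 1), where rank is the 1-based position of the target.
    """
    top_k = list(ranked_items[:k])
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1
    return 1.0 / math.log2(rank + 1)


# Hypothetical ranking produced by some recommender for one user.
ranking = ["a", "b", "c", "d"]
print(hit_at_k(ranking, "c", k=5))   # target is at rank 3, inside the top 5
print(ndcg_at_k(ranking, "c", k=5))  # 1 / log2(4)
```

Benchmark numbers are typically the mean of these per-user scores over the test set, sometimes reported as a fraction and sometimes multiplied by 100.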