Do LLMs Understand User Preferences? Evaluating LLMs On User Rating Prediction

About

Large Language Models (LLMs) have demonstrated exceptional capabilities in generalizing to new tasks in a zero-shot or few-shot manner. However, the extent to which LLMs can comprehend user preferences based on their previous behavior remains an open research question. Traditionally, Collaborative Filtering (CF) has been the most effective method for these tasks, relying predominantly on large volumes of rating data. In contrast, LLMs typically demand considerably less data while maintaining exhaustive world knowledge about each item, such as movies or products. In this paper, we conduct a thorough examination of both CF and LLMs within the classic task of user rating prediction, which involves predicting a user's rating for a candidate item based on their past ratings. We investigate various LLMs of different sizes, ranging from 250M to 540B parameters, and evaluate their performance in zero-shot, few-shot, and fine-tuning scenarios. We conduct a comprehensive analysis comparing LLMs with strong CF methods, and find that zero-shot LLMs lag behind traditional recommender models that have access to user interaction data, indicating the importance of user interaction data. However, through fine-tuning, LLMs achieve comparable or even better performance with only a small fraction of the training data, demonstrating their potential through data efficiency.

Wang-Cheng Kang, Jianmo Ni, Nikhil Mehta, Maheswaran Sathiamoorthy, Lichan Hong, Ed Chi, Derek Zhiyuan Cheng • 2023
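
The rating prediction task described in the abstract can be illustrated with a minimal sketch: format a user's past ratings and a candidate item as a text prompt, then score predictions with mean absolute error (MAE), the metric reported for the rating prediction benchmarks on this page. The prompt wording, the 1 to 5 rating scale, and the function names `build_rating_prompt` and `mean_absolute_error` are assumptions made for illustration; they are not taken from the paper, and no model call is shown.

```python
from typing import List, Tuple


def build_rating_prompt(history: List[Tuple[str, float]], candidate: str) -> str:
    """Format past (item, rating) pairs and a candidate item as a text prompt.

    The template below is hypothetical, not the prompt used in the paper.
    """
    lines = ["Here is a user's rating history (1 to 5 stars):"]
    for title, rating in history:
        # ':g' drops a trailing '.0', so 4.0 is shown as 4
        lines.append(f"- {title}: {rating:g}")
    lines.append(f"Predict the user's rating for: {candidate}")
    lines.append("Answer with a single number between 1 and 5.")
    return "\n".join(lines)


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    """Average absolute difference between true and predicted ratings."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up example data, for illustration only
    prompt = build_rating_prompt(
        [("Acoustic guitar strings", 5.0), ("Clip-on tuner", 3.0)],
        "Guitar capo",
    )
    print(prompt)
    print(mean_absolute_error([5.0, 3.0, 4.0], [4.0, 3.0, 2.0]))
```

Lower MAE is better. In a few-shot variant, several worked examples of history and rating would presumably be prepended to the same kind of prompt.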

Related benchmarks

Task                       Dataset                                    Result             Rank
Rating Prediction          Amazon Musical Instruments 5-core (test)   MAE 0.5411         15
Rating Prediction          Amazon Instant Video 5-core (test)         MAE 0.7425         15
Multimodal Recommendation  Amazon Clothing Few-Shot (test)            HR (Top-5) 0.1433  12
Multimodal Recommendation  Amazon Clothing Zero-Shot (test)           HR@5 14.36         12
Multimodal Recommendation  Amazon Sports Few-Shot (test)              HR (Top-5) 16.93   12
Multimodal Recommendation  Amazon Toys Few-Shot (test)                HR (Top-5) 0.1447  12
Multimodal Recommendation  Amazon Toys Zero-Shot (test)               HR@5 14.26         12
Multimodal Recommendation  Amazon Sports Zero-Shot (test)             HR@5 0.1705        12
Rating Prediction          Amazon 5-core Office Products              MAE 0.6455         8