LaMP: When Large Language Models Meet Personalization

About

This paper highlights the importance of personalization in large language models and introduces the LaMP benchmark -- a novel benchmark for training and evaluating language models for producing personalized outputs. LaMP offers a comprehensive evaluation framework with diverse language tasks and multiple entries for each user profile. It consists of seven personalized tasks, spanning three text classification and four text generation tasks. We additionally propose two retrieval augmentation approaches that retrieve personal items from each user profile for personalizing language model outputs. To this aim, we study various retrieval models, including term matching, semantic matching, and time-aware methods. Extensive experiments on LaMP for zero-shot and fine-tuned language models demonstrate the efficacy of the proposed retrieval augmentation approach and highlight the impact of personalization in various natural language tasks.

Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani • 2023
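
The retrieval augmentation idea described in the abstract can be illustrated with a minimal sketch: pick a few items from a user's profile, then prepend them to the task input before it reaches the language model. Everything below is an assumption made for illustration and is not the authors' implementation: the `ProfileItem` schema, the function names, and the prompt layout are hypothetical, and a simple TF-IDF term overlap stands in for the term-matching retriever. A semantic-matching variant would swap the scoring function for similarity between dense embeddings.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class ProfileItem:
    """One entry in a user profile (hypothetical schema, not LaMP's data format)."""
    text: str
    timestamp: int  # larger value means more recent


def tokenize(text: str) -> list[str]:
    # Lowercase whitespace tokenization; a real system would use a proper tokenizer.
    return text.lower().split()


def term_match_scores(query: str, items: list[ProfileItem]) -> list[float]:
    """Score each item by the summed TF-IDF weight of the terms it shares with the query."""
    docs = [Counter(tokenize(item.text)) for item in items]
    n_docs = len(docs)
    query_terms = set(tokenize(query))
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            if term in doc:
                doc_freq = sum(1 for d in docs if term in d)
                idf = math.log((n_docs + 1) / (doc_freq + 1)) + 1.0
                score += doc[term] * idf
        scores.append(score)
    return scores


def retrieve(query: str, items: list[ProfileItem], k: int = 2,
             strategy: str = "term") -> list[ProfileItem]:
    """Return the top-k profile items by term matching or by recency."""
    if strategy == "term":
        scores = term_match_scores(query, items)
        order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    elif strategy == "recency":
        # Time-aware variant: ignore the query and prefer the newest items.
        order = sorted(range(len(items)), key=lambda i: items[i].timestamp, reverse=True)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [items[i] for i in order[:k]]


def build_prompt(task_input: str, retrieved: list[ProfileItem]) -> str:
    """Prepend the retrieved profile items to the task input (hypothetical prompt layout)."""
    history = "\n".join(f"- {item.text}" for item in retrieved)
    return f"User history:\n{history}\n\nTask input: {task_input}"
```

Calling `build_prompt(query, retrieve(query, profile, k=2))` produces a single string that can be passed to a zero-shot or fine-tuned model. The value of `k` and the choice of strategy are tunable knobs in this sketch, not settings taken from the paper.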

Related benchmarks

Task | Dataset | Result | Rank
Personalized Reward Modeling | PRISM Personalized | Accuracy: 54.17 | 44
Personalized Reward Modeling | Chatbot Arena Personalized | Accuracy: 58.15 | 42
Personalized Reward Modeling | BESPOKE-Meta OOD | Binary Preference Accuracy: 57.75 | 18
Personalized Question Answering | LaMP-QA | Accuracy (Arts & Entertainment): 42.58 | 10
News Headline Generation | LaMP-4 1.0 (test) | ROUGE-1: 0.188 | 8
Scholarly Title Generation | LaMP-5 1.0 (test) | ROUGE-1: 0.483 | 8
Language Model Personalization | LaMP few-shot personalization setting | LaMP-1 Accuracy: 45.6 | 8
Language Model Personalization | LaMP standard (full-data) | LaMP-1 Score: 0.584 | 8
Citation Identification | LaMP-1 Personalized Citation Identification 1.0 (test user-based separation) | Accuracy: 69.9 | 4
Citation Identification | LaMP-1 1.0 (test) | Accuracy: 63.6 | 4
Showing 10 of 18 rows
