Personalized LLM Decoding via Contrasting Personal Preference

About

As large language models (LLMs) are progressively deployed in various real-world applications, personalization of LLMs has become increasingly important. While various approaches to LLM personalization such as prompt-based and training-based methods have been actively explored, the development of effective decoding-time algorithms remains largely overlooked, despite their demonstrated potential. In this paper, we propose CoPe (Contrasting Personal Preference), a novel decoding-time approach applied after performing parameter-efficient fine-tuning (PEFT) on user-specific data. Our core idea is to leverage reward-guided decoding specifically for personalization by maximizing each user's implicit reward signal. We evaluate CoPe across five open-ended personalized text generation tasks. Our empirical results demonstrate that CoPe achieves strong performance, improving personalization by an average of 10.57% in ROUGE-L, without relying on external reward models or additional training procedures.
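The abstract does not spell out the decoding rule, so the following is only a minimal sketch of one way a contrast-based, reward-guided decoding step could look. It assumes the implicit reward of a token is the log-probability ratio between the PEFT-tuned (personalized) model and the base model, and that candidates are restricted to tokens the personalized model already finds plausible. The function names, the `alpha` weight, and the `plausibility` threshold are hypothetical and not taken from the paper.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def contrastive_next_token(personal_logits, base_logits, alpha=1.0, plausibility=0.1):
    """Pick the next token id by contrasting two models (illustrative assumption).

    personal_logits: next-token logits from the user-specific PEFT model
    base_logits:     next-token logits from the untuned base model
    alpha:           hypothetical weight on the implicit reward term
    plausibility:    hypothetical cutoff, relative to the top personalized probability
    """
    lp_personal = log_softmax(personal_logits)
    lp_base = log_softmax(base_logits)

    # Keep only tokens whose personalized probability is at least
    # `plausibility` times the probability of the most likely token.
    cutoff = max(lp_personal) + math.log(plausibility)

    best_token, best_score = None, -math.inf
    for token_id, (lp_p, lp_b) in enumerate(zip(lp_personal, lp_base)):
        if lp_p < cutoff:
            continue
        # Assumed implicit reward: how much more the personalized model
        # prefers this token than the base model does.
        implicit_reward = lp_p - lp_b
        score = lp_p + alpha * implicit_reward
        if score > best_score:
            best_token, best_score = token_id, score
    return best_token

# Toy vocabulary of three tokens: the personalized model has shifted
# probability toward token 1 relative to the base model.
personal = [2.0, 1.8, -3.0]
base = [2.0, 0.5, -3.0]
chosen = contrastive_next_token(personal, base, alpha=1.0)
```

In this sketch, setting `alpha=0` reduces the rule to greedy decoding over the personalized model, while larger values favor tokens whose probability rose most after user-specific tuning. No external reward model is involved, which matches the claim in the abstract, but the exact scoring function used by CoPe may differ.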

Hyungjune Bu, Chanjoo Jung, Minjae Kang, Jaehyung Kim • 2025

Related benchmarks

Task	Dataset	Result
Scholarly Title Generation	LaMP Scholarly Title Generation	ROUGE-1 0.519	21
Abstract generation	LaMP Abstract Gen.	ROUGE-1 39.2	9
News Headline	LaMP News Headline	ROUGE-1 Score 20.5	9
Review Writing	LaMP Review Writing	ROUGE-1 33.5	9
Topic Writing	LaMP Topic Writing	ROUGE-1 0.281	9
Personalized Generation	LongLaMP (Pair A) - Review (test)	ROUGE-1 28.54	8
Personalized Generation	LongLaMP Pair A Writing (test)	ROUGE-1 28.17	8
Personalized Generation	LongLaMP (Pair A) - Abstract (test)	ROUGE-1 39.44	8
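
For readers unfamiliar with the metric in the table, the following is a simplified sketch of ROUGE-1 as a unigram-overlap F1 score. It assumes lowercased whitespace tokenization with no stemming, so its values will not exactly match an official ROUGE implementation; the function name `rouge1_f1` is hypothetical.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap between two strings."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token,
    # which gives the clipped number of matching unigrams.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat", "the cat sat down")
```

The function returns a value between 0 and 1. Some rows in the table appear to be reported on that scale (for example 0.519) and others on a 0 to 100 scale (for example 39.2), so the rows are not directly comparable without checking each benchmark's convention.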
