
Personalized LLM Decoding via Contrasting Personal Preference

About

As large language models (LLMs) are progressively deployed in various real-world applications, personalization of LLMs has become increasingly important. While various approaches to LLM personalization such as prompt-based and training-based methods have been actively explored, the development of effective decoding-time algorithms remains largely overlooked, despite their demonstrated potential. In this paper, we propose CoPe (Contrasting Personal Preference), a novel decoding-time approach applied after performing parameter-efficient fine-tuning (PEFT) on user-specific data. Our core idea is to leverage reward-guided decoding specifically for personalization by maximizing each user's implicit reward signal. We evaluate CoPe across five open-ended personalized text generation tasks. Our empirical results demonstrate that CoPe achieves strong performance, improving personalization by an average of 10.57% in ROUGE-L, without relying on external reward models or additional training procedures.
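The abstract describes reward-guided decoding that maximizes a user's implicit reward, obtained by contrasting a PEFT-personalized model against its base model. The paper's exact formulation is not reproduced on this page, but the general contrastive idea can be sketched as scoring each candidate next token by the personalized logits plus a weighted difference between personalized and base logits (the log-ratio acting as an implicit reward). The function name and the `alpha` weight below are illustrative assumptions, not the authors' notation:

```python
import numpy as np

def contrastive_next_token_scores(personal_logits, base_logits, alpha=1.0):
    """Sketch of contrastive personalized decoding.

    personal_logits: next-token logits from the PEFT-personalized model.
    base_logits:     next-token logits from the original base model.
    alpha:           strength of the contrast (illustrative hyperparameter).

    The difference (personal - base) approximates an implicit per-user
    reward signal; adding it to the personalized logits boosts tokens the
    user-adapted model prefers relative to the generic base model.
    """
    personal = np.asarray(personal_logits, dtype=float)
    base = np.asarray(base_logits, dtype=float)
    return personal + alpha * (personal - base)

# Toy vocabulary of two tokens: the base model slightly favors token 0,
# the personalized model favors token 1; the contrast amplifies the
# user-specific preference.
scores = contrastive_next_token_scores([1.0, 2.0], [3.0, 2.0], alpha=1.0)
```

In a real decoding loop this scoring would be applied at every step (e.g. as a logits processor) before sampling or greedy selection; with `alpha=0` it reduces to ordinary decoding from the personalized model.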

Hyungjune Bu, Chanjoo Jung, Minjae Kang, Jaehyung Kim • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Personalized Generation | LongLaMP (Pair A) - Review (test) | ROUGE-1: 28.54 | 8 |
| Personalized Generation | LongLaMP (Pair A) - Writing (test) | ROUGE-1: 28.17 | 8 |
| Personalized Generation | LongLaMP (Pair A) - Abstract (test) | ROUGE-1: 39.44 | 8 |
