CLIPer: Tailoring Diverse User Preference via Classifier-Guided Inference-Time Personalization
About
Personalized LLMs can significantly enhance user experiences by tailoring responses to preferences such as helpfulness, conciseness, and humor. However, fine-tuning models to address all possible combinations of user preferences is computationally expensive and impractical. In this paper, we introduce CLIPer (Classifier-guided Inference-time Personalization), a lightweight personalization approach that leverages a classifier model to steer LLM generation dynamically toward different user preferences at inference time. Our method eliminates the need for extensive fine-tuning, incurring negligible additional computational overhead while enabling more controllable and nuanced personalization across single- and multi-dimensional preferences. Comprehensive empirical analyses demonstrate the scalability and effectiveness of our approach in delivering personalized language generation.
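The abstract does not spell out how the classifier steers generation. As a rough illustration only, the sketch below shows one generic form of classifier-guided decoding: the base model's next-token logits are shifted by weighted classifier log-probabilities, one classifier per preference dimension. This is an assumption for illustration, not necessarily the CLIPer formulation; the function names, the toy three-token vocabulary, and the weights are all hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def guided_distribution(base_logits, classifier_log_probs, weights):
    """Reweight next-token probabilities with classifier guidance.

    base_logits: base LM logits, one per candidate token.
    classifier_log_probs: one list per preference dimension, giving the
        (hypothetical) classifier's log-probability that choosing each
        token keeps the response aligned with that preference.
    weights: one guidance strength per preference dimension.
    """
    scores = list(base_logits)
    for weight, log_probs in zip(weights, classifier_log_probs):
        for i, lp in enumerate(log_probs):
            scores[i] += weight * lp
    return [math.exp(x) for x in log_softmax(scores)]


# Toy example: three candidate tokens, one preference dimension.
base_logits = [2.0, 1.0, 0.0]
concise_log_probs = [math.log(0.1), math.log(0.8), math.log(0.1)]

unguided = guided_distribution(base_logits, [concise_log_probs], [0.0])
guided = guided_distribution(base_logits, [concise_log_probs], [2.0])
```

With a weight of zero the base distribution is unchanged; raising the weight moves probability mass toward tokens the classifier favors. Multi-dimensional preferences would correspond to passing several classifier outputs with separate weights, and no change to the base model's parameters is involved.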
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Preference Alignment | UltraFeedback | Win Rate | 81 | 16 |
| Preference Alignment | Koala | Wins (Count) | 196 | 14 |
| Response Preference Evaluation | UltraFeedback (test) | Win Rate | 83.42 | 9 |
| Personalized LLM response generation | Koala (test) | Win Rate (Reward Model) | 88 | 3 |
| Preference Alignment | Koala | Win Rate (Reward Model) | 70.63 | 3 |
| Preference Alignment | UltraFeedback | Win Rate (RM Evaluator) | 65.5 | 3 |