EXACT: Explicit Attribute-Guided Decoding-Time Personalization

About

Achieving personalized alignment requires adapting large language models to each user's evolving context. While decoding-time personalization offers a scalable alternative to training-time methods, existing approaches largely rely on implicit, less interpretable preference representations and impose a rigid, context-agnostic user representation, so they fail to account for how preferences shift across prompts. We introduce EXACT, a new decoding-time personalization method that aligns generation with limited pairwise preference feedback using a predefined set of interpretable attributes. In the offline stage, EXACT identifies user-specific attribute subsets by maximizing the likelihood of preferred responses. Then, at online inference, EXACT retrieves the attributes most semantically relevant to an incoming prompt and injects them into the context to steer generation. We establish theoretical approximation guarantees for the proposed algorithm under mild assumptions, and prove that our similarity-based retrieval mechanism mitigates contextual preference shifts, adapting to disparate tasks without pooling conflicting preferences. Extensive experiments on human-annotated preference datasets demonstrate that EXACT consistently outperforms strong baselines in both preference modeling accuracy and personalized generation quality.
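
The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the greedy selection rule, the `score` callable (a stand-in for the likelihood of preferred responses), the bag-of-words `embed` function, the prompt-keyed `memory` layout, and the prompt template in `build_context` are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumed stand-in for a sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_attributes(candidates, score, k):
    """Offline stage (assumed greedy variant): repeatedly add the attribute
    that most increases `score`, a hypothetical proxy for the likelihood of
    the user's preferred responses. Stops early if nothing helps."""
    chosen = []
    for _ in range(k):
        best, best_gain = None, 0.0
        for attr in candidates:
            if attr in chosen:
                continue
            gain = score(chosen + [attr]) - score(chosen)
            if gain > best_gain:
                best, best_gain = attr, gain
        if best is None:
            break
        chosen.append(best)
    return chosen


def retrieve_attributes(prompt, memory, top_n=1):
    """Online stage: `memory` is a list of (past_prompt, attributes) pairs.
    Return the attributes stored with the `top_n` most similar past prompts."""
    query = embed(prompt)
    ranked = sorted(
        memory, key=lambda item: cosine(query, embed(item[0])), reverse=True
    )
    attrs = []
    for _, item_attrs in ranked[:top_n]:
        for attr in item_attrs:
            if attr not in attrs:
                attrs.append(attr)
    return attrs


def build_context(prompt, attrs):
    """Inject the retrieved attributes into the prompt (hypothetical template)."""
    return f"Respond in a style that is: {', '.join(attrs)}.\n\n{prompt}"
```

Keying the memory by past prompts, rather than keeping one pooled attribute set per user, is one way to keep preferences from unrelated tasks apart; how EXACT actually stores and scores attributes is not specified here.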

Xin Yu, Hanwen Xing, Lingzhou Xue • 2026

Related benchmarks

Task	Dataset	Result
Preference Prediction	PRISM (test)	Accuracy 66.62	51
Personalized Preference Modeling	Summarize from Human Feedback (test)	Mean Accuracy 66.01	12
Personalized Summarization	Summarize from Human Feedback (test)	Win Rate 78.23	9

