
Learning Transferable Latent User Preferences for Human-Aligned Decision Making

About

Large language models (LLMs) are increasingly used as reasoning modules in many applications. While they are efficient in certain tasks, LLMs often struggle to produce human-aligned solutions. Human-aligned decision making requires accounting for both explicitly stated goals and latent user preferences that shape how ambiguous situations should be resolved. Existing approaches to incorporating such preferences either rely on extensive and repeated user interactions or fail to generalize latent preferences across tasks and contexts, limiting their practical applicability. We consider a setting in which an LLM is used for high-level reasoning and is responsible for inferring latent user preferences from limited interactions, which guides downstream decision making. We introduce CLIPR (Conversational Learning for Inferring Preferences and Reasoning), a framework that learns actionable, transferable natural language rules that represent latent user preferences from minimal conversational input. These rules are iteratively refined through adaptive feedback and applied to both in-distribution and out-of-distribution ambiguous tasks across multiple environments. Evaluations on three datasets and a user study show that CLIPR consistently outperforms existing methods in improving alignment and reducing inference costs.

Alina Hyk, Sandhya Saisubramanian • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Introspective Planning | KitchenAmbig (OOD) | Average Accuracy: 97.6 | 10 |
| Preference-aligned decision making | AmbiK (test) | Accuracy: 84.6 | 10 |
| Preference-aligned decision making | Housekeep (test) | Accuracy: 42.5 | 10 |
| Preference-aligned decision making | Mobile Manipulation (test) | Accuracy: 67.1 | 10 |
| Introspective Planning | KitchenAmbig (In-Distribution) | Average Accuracy: 94.3 | 10 |
| User-aligned task completion | KitchenAmbig (In-Distribution) | Accuracy: 84 | 3 |
| User-aligned task completion | KitchenAmbig (OOD) | Accuracy: 87.3 | 3 |
