
AlignUSER: Human-Aligned LLM Agents via World Models for Recommender System Evaluation

About

Evaluating recommender systems remains challenging due to the gap between offline metrics and real user behavior, as well as the scarcity of interaction data. Recent work explores large language model (LLM) agents as synthetic users, yet they typically rely on few-shot prompting, which yields a shallow understanding of the environment and limits their ability to faithfully reproduce user actions. We introduce AlignUSER, a framework that learns world-model-driven agents from human interactions. Given rollout sequences of actions and states, we formalize world modeling as a next-state prediction task that helps the agent internalize the environment. To align actions with human personas, we generate counterfactual trajectories around demonstrations and prompt the LLM to compare its decisions with human choices, identify suboptimal actions, and extract lessons. The learned policy is then used to drive agent interactions with the recommender system. We evaluate AlignUSER across multiple datasets and demonstrate closer alignment with genuine humans than prior work, at both the micro and macro levels.
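The two learning signals described in the abstract can be sketched as data-construction steps. This is an illustrative sketch only, not the authors' actual implementation: function names, prompt wording, and the rollout format are assumptions.

```python
# Hypothetical sketch of AlignUSER-style training data construction.
# Two ingredients from the abstract:
#   1) world modeling as next-state prediction over (state, action) rollouts
#   2) counterfactual comparison of agent vs. human actions to extract lessons

def next_state_examples(rollout):
    """Turn a rollout [(state, action), ...] into supervised pairs:
    predict s_{t+1} from (s_t, a_t). Illustrative text format."""
    examples = []
    for (s_t, a_t), (s_next, _) in zip(rollout, rollout[1:]):
        examples.append({"input": f"State: {s_t}\nAction: {a_t}",
                         "target": s_next})
    return examples

def counterfactual_prompt(state, human_action, agent_action):
    """Prompt asking the LLM to compare its decision with the human
    demonstration and extract a lesson (hypothetical wording)."""
    return (
        f"In state '{state}', you chose '{agent_action}' but the human chose "
        f"'{human_action}'. Explain why the human's choice may better fit the "
        "persona, and state one lesson for future decisions."
    )

# Toy rollout of a simulated user browsing a recommender system.
rollout = [("home page", "search 'sci-fi'"),
           ("results list", "click item 3"),
           ("item page", "add to cart")]
pairs = next_state_examples(rollout)
```

Each `(state, action) -> next state` pair gives the agent a self-supervised view of how the environment responds, while the counterfactual prompts target the alignment of actions with the human persona.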

Nicolas Bougie, Gian Maria Marconi, Tony Yip, Narimasa Watanabe • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Next Action Prediction | OPeRA (test) | Action Generation Acc: 52.92 | 18 |
| Binary Classification | MovieLens | Accuracy: 83.17 | 15 |
| Binary Classification | AmazonBook | Accuracy: 85.46 | 15 |
| Binary Classification | Steam | Accuracy: 82.69 | 15 |
| Rating Prediction | MovieLens | RMSE: 0.4292 | 8 |
| Rating Prediction | AmazonBook | RMSE: 0.4649 | 8 |
| Rating Prediction | Steam | RMSE: 0.497 | 8 |
| Reasoning and Persona Consistency | OPeRA (test) | Pages per Session: 5.1 | 7 |
