LUMOS: Large User MOdels for User Behavior Prediction

About

User behavior prediction at scale remains a critical challenge for online B2C platforms. Traditional approaches rely heavily on task-specific models and domain-specific feature engineering, which is time-consuming, computationally expensive, dependent on domain expertise, and therefore not scalable. We present LUMOS (Large User MOdel Series), a transformer-based architecture that eliminates task-specific models and manual feature engineering by learning multiple tasks jointly from raw user activity data alone. LUMOS introduces a novel cross-attention mechanism that conditions predictions on known future events (e.g., holidays and sales), enabling the model to answer questions such as "how will upcoming holidays affect user engagement?" The architecture also employs multi-modal tokenization, combining user activities, event context, and static user demographic attributes into rich representations processed through specialized embedding pathways. Through extensive experiments on a production dataset spanning 1.7 trillion user activity tokens from 250 million users, we demonstrate that LUMOS outperforms traditional task-specific models. Across 5 tasks with established baselines, we achieve an average improvement of 0.025 in ROC-AUC for binary classification tasks and a 4.6% reduction in MAPE for regression tasks. Online A/B testing confirms that these improvements translate to measurable business impact, with a 3.15% increase in Daily Active Users.
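The cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the single-head layout, the residual connection, the dimensions, and every name here (`cross_attention`, `user_states`, `event_embeddings`, `rand_matrix`) are assumptions chosen for illustration. It only shows the general pattern in which user activity representations act as queries and embeddings of known future events act as keys and values.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(user_states, event_embeddings, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical layout, not from the paper).

    Queries come from user activity states; keys and values come from
    embeddings of known future events such as holidays or sales.
    """
    q = matmul(user_states, w_q)            # (T, d)
    k = matmul(event_embeddings, w_k)       # (E, d)
    v = matmul(event_embeddings, w_v)       # (E, d)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]    # (d, E)
    scores = matmul(q, k_t)                 # (T, E)
    weights = [softmax([s / scale for s in row]) for row in scores]
    context = matmul(weights, v)            # (T, d)
    # Residual connection (an assumption): add event context to each user state.
    conditioned = [
        [u + c for u, c in zip(u_row, c_row)]
        for u_row, c_row in zip(user_states, context)
    ]
    return conditioned, weights


rng = random.Random(0)


def rand_matrix(rows, cols, scale=1.0):
    """Random Gaussian matrix standing in for learned parameters or embeddings."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


T, E, d = 6, 3, 8  # 6 activity positions, 3 future events, width 8 (all illustrative)
user_states = rand_matrix(T, d)
event_embeddings = rand_matrix(E, d)
w_q, w_k, w_v = rand_matrix(d, d, 0.1), rand_matrix(d, d, 0.1), rand_matrix(d, d, 0.1)

conditioned, weights = cross_attention(user_states, event_embeddings, w_q, w_k, w_v)
```

Each row of `weights` is a distribution over the future events, so it indicates how strongly a given activity position draws on each event when forming its conditioned representation.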

Dhruv Nigam, Naman Agarwal, Krishna Murthy, Susmit Saha • 2025

Related benchmarks

Task	Dataset	Metric	Value
Task 1	User Behavior Prediction	ROC-AUC	0.977	3
Task 2	User Behavior Prediction	ROC-AUC	72.2	3
Task 3	User Behavior Prediction	ROC-AUC	0.442	3
Task 4	User Behavior Prediction	ROC-AUC	70.8	3
Task 5	User Behavior Prediction	ROC-AUC	97.5	3
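
For readers unfamiliar with the two metrics named in the abstract, the following sketch shows how ROC-AUC and MAPE are commonly computed. It uses the standard textbook definitions, not the authors' evaluation code; the function names and toy inputs are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive example is scored
    above a random negative one, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Toy data: 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])   # 0.75

# Toy data: both predictions are off by 10% of the actual value.
err = mape([100.0, 200.0], [110.0, 180.0])           # about 10 (percent)
```

A higher ROC-AUC is better and it is bounded by 1, while a lower MAPE is better, which is why the abstract reports an improvement in one and a reduction in the other.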
