Tuning Language Models for Robust Prediction of Diverse User Behaviors
About
Predicting user behavior is essential for intelligent assistant services, yet deep learning models often struggle to capture long-tailed behaviors. Large language models (LLMs), pretrained on vast corpora rich in behavioral knowledge, offer promise. However, existing fine-tuning approaches tend to overfit to frequent "anchor" behaviors, reducing their ability to predict less common "tail" behaviors. In this paper, we introduce BehaviorLM, a progressive fine-tuning approach that addresses this issue. In the first stage, the LLM is fine-tuned on anchor behaviors while preserving its general behavioral knowledge. In the second stage, it is fine-tuned on a subset of all behaviors balanced by sample difficulty, improving tail-behavior prediction without sacrificing anchor performance. Experimental results on two real-world datasets demonstrate that BehaviorLM robustly predicts both anchor and tail behaviors and effectively leverages the LLM's behavioral knowledge to master tail-behavior prediction with only few-shot examples.
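The second-stage idea above, selecting a behavior-balanced subset based on per-sample difficulty, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `difficulty` field is assumed to be some per-sample hardness score (e.g., the first-stage model's loss on that sample), and the function name and data layout are hypothetical.

```python
import random
from collections import defaultdict

def balanced_difficulty_subset(samples, k_per_behavior, seed=0):
    """Select up to k_per_behavior samples per behavior class,
    preferring the hardest examples.

    `samples` is a list of dicts with keys:
      'behavior'   - the behavior label (anchor or tail)
      'difficulty' - a hardness score, e.g. the first-stage
                     model's loss on that sample (assumed)
    """
    by_behavior = defaultdict(list)
    for s in samples:
        by_behavior[s['behavior']].append(s)

    subset = []
    for behavior, group in by_behavior.items():
        # Hardest samples first; tail behaviors with fewer than
        # k_per_behavior examples contribute everything they have,
        # so the subset stays roughly balanced across behaviors.
        group.sort(key=lambda s: s['difficulty'], reverse=True)
        subset.extend(group[:k_per_behavior])

    # Shuffle so fine-tuning batches mix behaviors.
    random.Random(seed).shuffle(subset)
    return subset
```

Capping every behavior at the same budget `k_per_behavior` is one simple way to keep frequent anchor behaviors from dominating the second fine-tuning stage while still feeding the model its hardest examples.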
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mobility Prediction | Foursquare NYC | Weighted Precision | 35.08 | 12 |
| User Behavior Prediction | App Usage Dataset | Weighted Precision | 64.4 | 12 |
| User Behavior Prediction | Honor Behavior Dataset | Weighted Precision | 63.17 | 12 |