Mixture-of-Personas Language Models for Population Simulation

About

Advances in Large Language Models (LLMs) have paved the way for emerging applications in various domains, such as human behavior simulation, where LLMs could augment human-generated data in social science research and machine learning model training. However, pretrained LLMs often fail to capture the behavioral diversity of target populations due to the inherent variability across individuals and groups. To address this, we propose Mixture of Personas (MoP), a probabilistic prompting method that aligns LLM responses with the target population. MoP is a contextual mixture model in which each component is an LM agent characterized by a persona and an exemplar representing subpopulation behaviors. During simulation, the persona and exemplar are randomly chosen according to the learned mixing weights to elicit diverse LLM responses. MoP is flexible, requires no model finetuning, and is transferable across base models. Experiments on synthetic data generation show that MoP outperforms competing methods in alignment and diversity metrics.
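
The sampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Component` class, the `sample_prompt` function, the prompt template, and the fixed weights are all hypothetical, and the sketch assumes the mixing weights have already been learned and do not depend on the input context (the abstract describes them as contextual).

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Component:
    """One hypothetical mixture component: a persona plus its candidate exemplars."""
    persona: str
    exemplars: List[str]
    exemplar_weights: List[float]  # assumed to be learned elsewhere


def sample_prompt(
    components: List[Component],
    persona_weights: List[float],
    rng: random.Random,
) -> Tuple[str, str, str]:
    """Draw a persona, then an exemplar, and build a prompt from both."""
    # Pick a component in proportion to its mixing weight.
    comp = rng.choices(components, weights=persona_weights, k=1)[0]
    # Pick an exemplar for that persona in proportion to its own weight.
    exemplar = rng.choices(comp.exemplars, weights=comp.exemplar_weights, k=1)[0]
    # Illustrative template only; the real prompt format is not given in the source.
    prompt = (
        f"Persona: {comp.persona}\n"
        f"Example: {exemplar}\n"
        "Write a new response in the same voice:"
    )
    return comp.persona, exemplar, prompt


components = [
    Component("a frequent diner who writes short reviews",
              ["Great tacos, slow service."], [1.0]),
    Component("a film student who writes detailed critiques",
              ["The pacing drags in act two.", "Stunning cinematography."],
              [0.5, 0.5]),
]
rng = random.Random(0)
persona, exemplar, prompt = sample_prompt(components, [0.7, 0.3], rng)
```

Calling `sample_prompt` repeatedly yields prompts whose persona frequencies follow the mixing weights, which is how a mixture of this kind spreads responses across subpopulations without finetuning the base model.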

Ngoc Bui, Hieu Trung Nguyen, Shantanu Kumar, Julian Theodore, Weikang Qiu, Viet Anh Nguyen, Rex Ying • 2025

Related benchmarks

Task                       Dataset         Result      Rank
Sentiment Classification   SST2 (test)     --          214
Sentiment Classification   IMDB (test)     --          144
Topic Classification       AG News (test)  --          98
Sentiment Classification   Yelp (test)     --          46
Synthetic Data Generation  AGNews (test)   FID 0.951   7
Synthetic Data Generation  Yelp (test)     FID 0.948   7
Synthetic Data Generation  SST-2 (test)    FID 1.131   7
Synthetic Data Generation  IMDB (test)     FID 0.771   7
