Distribution Shift Alignment Helps LLMs Simulate Survey Response Distributions

About

Large language models (LLMs) offer a promising way to simulate human survey responses, potentially reducing the cost of large-scale data collection. However, existing zero-shot methods suffer from prompt sensitivity and low accuracy, while conventional fine-tuning approaches mostly fit the training set distributions and struggle to produce results more accurate than the training set itself, which deviates from the original goal of using LLMs to simulate survey responses. Building on this observation, we introduce Distribution Shift Alignment (DSA), a two-stage fine-tuning method that aligns both the output distributions and the distribution shifts across different backgrounds. By learning how these distributions change rather than fitting training data, DSA can provide results substantially closer to the true distribution than the training data. Empirically, DSA consistently outperforms other methods on five public survey datasets. We further conduct a comprehensive comparison covering accuracy, robustness, and data savings. DSA reduces the required real data by 53.48-69.12%, demonstrating its effectiveness and efficiency in survey simulation.
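To make the idea of "learning how distributions change" concrete, here is a minimal Python sketch. It is an illustration only and is not the DSA training objective or the WD metric reported for the paper: the function names (`total_variation`, `shift`, `shift_gap`), the choice of total variation as the distance, and all answer shares are hypothetical. It shows that a simulator can be off on each group's answer distribution while still capturing the shift between two backgrounds exactly.

```python
from typing import List, Sequence


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two discrete answer distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def shift(p_a: Sequence[float], p_b: Sequence[float]) -> List[float]:
    """Per-option change in answer share when moving from background A to B."""
    return [b - a for a, b in zip(p_a, p_b)]


def shift_gap(pred_a, pred_b, true_a, true_b) -> float:
    """Half the L1 gap between the predicted shift and the true shift."""
    predicted = shift(pred_a, pred_b)
    actual = shift(true_a, true_b)
    return 0.5 * sum(abs(x - y) for x, y in zip(predicted, actual))


# Hypothetical shares for a three-option question (agree / neutral / disagree).
true_young = [0.20, 0.50, 0.30]
true_old = [0.40, 0.40, 0.20]

# A simulator with the same bias on both groups: the levels are off,
# but the change from "young" to "old" is reproduced.
pred_young = [0.30, 0.45, 0.25]
pred_old = [0.50, 0.35, 0.15]

level_error = total_variation(pred_young, true_young)
shift_error = shift_gap(pred_young, pred_old, true_young, true_old)
print(f"per-group error: {level_error:.3f}, shift error: {shift_error:.3f}")
```

In this toy case the per-group error is about 0.10 while the shift error is zero up to floating-point rounding, which is the kind of signal a shift-based objective could exploit in addition to matching the output distributions themselves.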

Ji Huang, Mengfei Li, Shuai Shao • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Survey Simulation | ESS11 | Weighted Distance (WD) | 0.104 | 16 |
| Survey Simulation | ESS 9 | Wasserstein Distance (WD) | 0.111 | 16 |
| Survey Simulation | CFPS | Weighted Distance (WD) | 0.095 | 16 |
| Survey Simulation | WVS | Weighted Distance (WD) | 0.086 | 16 |
| Survey Simulation | CGSS | Weighted Distance (WD) | 0.101 | 16 |
| Human Response Distribution Simulation | Survey Datasets NewQ | Weighted Distance (WD) | 0.051 | 12 |
| Human Response Distribution Simulation | Survey Datasets (NewP) | WD | 0.062 | 12 |
| Human Response Distribution Simulation | Survey Datasets (Both split) | Weighted Distance (WD) | 0.069 | 12 |