PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization

About

In-Context Learning has shown great potential for aligning Large Language Models (LLMs) with human values, helping reduce harmful outputs and accommodate diverse preferences without costly post-training; this approach is known as In-Context Alignment (ICA). However, LLMs' comprehension of input prompts remains agnostic, which limits ICA's ability to address value tensions: human values are inherently pluralistic and often impose conflicting demands, e.g., stimulation vs. tradition. Current ICA methods therefore face the Instruction Bottleneck challenge, where LLMs struggle to reconcile multiple intended values within a single prompt, leading to incomplete or biased alignment. To address this, we propose PICACO, a novel pluralistic ICA method. Without fine-tuning, PICACO optimizes a meta-instruction that incorporates multiple values to better elicit LLMs' understanding of them and improve alignment. This is achieved by maximizing the total correlation between the specified values and LLM responses, which theoretically reinforces value conformity and reduces distractive noise, resulting in more effective instructions. Extensive experiments on five value sets show that PICACO works well with both black-box and open-source LLMs, outperforms several recent strong baselines, and achieves a better balance across up to 8 distinct values.
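
The abstract names total correlation as the quantity being maximized but does not give the objective's exact form. As a rough illustration only, the sketch below computes the standard total correlation of a small discrete joint distribution, TC = sum_i H(X_i) - H(X_1, ..., X_n). It is not PICACO's estimator or optimization procedure; the function name `total_correlation` and the toy variables (two binary value indicators plus one binary response feature) are assumptions made for this example.

```python
import math
from collections import defaultdict


def total_correlation(joint):
    """Total correlation (in nats) of a discrete joint distribution.

    `joint` maps outcome tuples (x_1, ..., x_n) to probabilities that sum to 1.
    TC = sum_i H(X_i) - H(X_1, ..., X_n); it is 0 when the variables are independent.
    """
    n = len(next(iter(joint)))

    # Accumulate each variable's marginal distribution from the joint table.
    marginals = [defaultdict(float) for _ in range(n)]
    for outcome, p in joint.items():
        for i, x in enumerate(outcome):
            marginals[i][x] += p

    def entropy(dist):
        # Shannon entropy; zero-probability outcomes contribute nothing.
        return -sum(p * math.log(p) for p in dist.values() if p > 0)

    return sum(entropy(m) for m in marginals) - entropy(joint)


# Hypothetical toy case: outcomes are (value_1, value_2, response_feature).
# Independent: every combination is equally likely.
independent = {(a, b, c): 0.125 for a in (0, 1) for b in (0, 1) for c in (0, 1)}
# Fully coupled: the response feature always agrees with both values.
coupled = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

tc_independent = total_correlation(independent)
tc_coupled = total_correlation(coupled)
```

In this toy setting the coupled distribution has a strictly larger total correlation than the independent one, which matches the intuition in the abstract that a higher total correlation between values and responses corresponds to responses that carry more information about the intended values.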

Han Jiang, Dongyao Zhu, Xiaoyuan Yi, Ziang Xiao, Zhihua Wei, Xing Xie • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| In-Context Value Alignment | Value Composition (Overall) | Confucianism Score 3.788 | 37 |
| Value Alignment | Confucianism-4 | Conformity Score 3.788 | 22 |
| Value Alignment | HH Balance-8 | Conformity Score 4.317 | 17 |
| Value Alignment | Harmlessness 4 | Conformity Score 4.305 | 16 |
| Value Alignment | Helpfulness 4 | Conformity Score 4.342 | 16 |
| Value Alignment | Liberalism 4 | Conformity Score 3.247 | 11 |
| Human Evaluation of Value Alignment | Value Composition Human Study | Confucianism Score 3.674 | 5 |
| In-Context Value Alignment | VALUE PORTRAIT Liberalism-4 (OOD) | Conformity Score 3.323 | 5 |