VISPA: Pluralistic Alignment via Automatic Value Selection and Activation

About

As large language models are increasingly used in high-stakes domains, it is essential that their outputs reflect not an average human preference but a range of varying perspectives. Achieving such pluralism, however, remains challenging. Existing approaches consider only a limited set of values or rely on prompt-level interventions, and so lack value control and representation. To address this, we introduce VISPA, a training-free pluralistic alignment framework that enables direct control over value expression through dynamic value selection and steering of the model's internal activations. Across extensive empirical studies spanning multiple models and evaluation settings, we show that VISPA is performant across all pluralistic alignment modes in healthcare and beyond. Further analysis reveals that VISPA is adaptable to different steering initiations, models, and values. These results suggest that pluralistic alignment can be achieved through internal activation mechanisms, offering a scalable path toward language models that serve all.
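To make "activation steering" concrete, the sketch below shows one common generic form of the idea: a difference-of-means steering vector added to a hidden state. This is an illustrative assumption, not VISPA's actual procedure; the function names (`mean_vector`, `steering_vector`, `steer`), the scaling parameter `alpha`, and the toy two-dimensional activations are all hypothetical.

```python
def mean_vector(rows):
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(with_value, without_value):
    """Difference of mean activations: (value expressed) minus (value absent)."""
    mu_pos = mean_vector(with_value)
    mu_neg = mean_vector(without_value)
    return [a - b for a, b in zip(mu_pos, mu_neg)]


def steer(hidden, vector, alpha=1.0):
    """Shift a hidden state along the steering vector, scaled by alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy activations (hypothetical): two examples per condition, two dimensions.
pos = [[1.0, 2.0], [3.0, 4.0]]   # prompts that express the target value
neg = [[0.0, 1.0], [2.0, 1.0]]   # prompts that do not
v = steering_vector(pos, neg)
steered = steer([0.5, 0.5], v, alpha=2.0)
```

In a real model the vectors would come from a chosen layer's activations and the shift would be applied during the forward pass; which layer, which values, and what strength to use are exactly the choices a value-selection step would have to make.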

Shenyan Zheng, Jiayou Zhong, Anudeex Shetty, Heng Ji, Preslav Nakov, Usman Naseem • 2026

Related benchmarks

Task	Dataset	Result
Steerable Value Alignment	OpinionQA	Value Alignment Score: 68.47	42
Value Alignment	GLOBALOPINIONQA VITAL Distributional 6 values	JS Distance: 0.18	42
Value Alignment	VITAL Distributional MORALCHOICE	JS Distance: 0.132	42
Value Alignment	ValueKaleidoscope VITAL Steerable setting	Value Alignment Score: 68.7	42
Pluralistic Alignment	VITAL Overton	Value Coverage: 43.76	35
Distributional Alignment	ModPlural (test)	JS Distance: 0.23	3
Overton Alignment	ModPlural (test)	Value Coverage: 30.34	3
Steerable Alignment	ModPlural (test)	Accuracy: 37.34	3
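
For readers unfamiliar with the JS Distance rows above, here is a minimal sketch of the Jensen-Shannon distance between two answer distributions, where lower means the model's distribution is closer to the reference. It assumes the standard definition (square root of the Jensen-Shannon divergence) with base-2 logarithms, which bounds the result in [0, 1]; the benchmarks listed here may use a different base or report the divergence itself. The function name `js_distance` and the example distributions are hypothetical.

```python
import math


def js_distance(p, q, base=2.0):
    """Jensen-Shannon distance: square root of the JS divergence."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Normalize so each input sums to 1.
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # KL divergence; terms with a zero numerator contribute nothing.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(jsd, 0.0))


# Hypothetical answer distributions over four survey options.
model_dist = [0.40, 0.30, 0.20, 0.10]
human_dist = [0.25, 0.25, 0.25, 0.25]
distance = js_distance(model_dist, human_dist)
```

Identical distributions give a distance of 0, and distributions with no overlapping support give 1 under the base-2 convention assumed here.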
