
VISPA: Pluralistic Alignment via Automatic Value Selection and Activation

About

As large language models are increasingly used in high-stakes domains, it is essential that their outputs reflect not an average human preference but rather a range of varying perspectives. Achieving such pluralism, however, remains challenging. Existing approaches consider a limited set of values or rely on prompt-level interventions, and so lack value control and representation. To address this, we introduce VISPA, a training-free pluralistic alignment framework that enables direct control over value expression through dynamic value selection and internal model activation steering. Across extensive empirical studies spanning multiple models and evaluation settings, we show that VISPA is performant across all pluralistic alignment modes in healthcare and beyond. Further analysis reveals that VISPA is adaptable to different steering initiations, models, and values. These results suggest that pluralistic alignment can be achieved through internal activation mechanisms, offering a scalable path toward language models that serve all.
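The abstract names two ingredients, value selection and activation steering, without giving details. The toy sketch below only illustrates the general idea of each and is not VISPA's actual procedure: the function names `select_values` and `steer`, the relevance scores, and the choice to normalize the steering direction are all assumptions made for illustration.

```python
import math

def select_values(relevance, k):
    """Hypothetical selection step: keep the k value names with the highest relevance score."""
    return sorted(relevance, key=relevance.get, reverse=True)[:k]

def steer(hidden, direction, strength):
    """Generic activation steering: add a scaled unit-length direction to a hidden state."""
    if len(hidden) != len(direction):
        raise ValueError("hidden state and direction must have the same length")
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("steering direction must be non-zero")
    return [h + strength * d / norm for h, d in zip(hidden, direction)]

# Made-up relevance scores for three values in some scenario.
chosen = select_values({"autonomy": 0.9, "care": 0.2, "fairness": 0.7}, k=2)

# Made-up 2-dimensional hidden state and steering direction.
steered = steer([1.0, 2.0], [3.0, 4.0], strength=5.0)
```

In a real model the hidden state would be a layer activation with thousands of dimensions, and the direction would have to be derived from the model itself; how VISPA does either is not stated here.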

Shenyan Zheng, Jiayou Zhong, Anudeex Shetty, Heng Ji, Preslav Nakov, Usman Naseem• 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Steerable Value Alignment | OpinionQA | Value Alignment Score | 68.47 | 42 |
| Value Alignment | GLOBALOPINIONQA VITAL Distributional 6 values | JS Distance | 0.18 | 42 |
| Value Alignment | VITAL Distributional MORALCHOICE | JS Distance | 0.132 | 42 |
| Value Alignment | ValueKaleidoscope VITAL Steerable setting | Value Alignment Score | 68.7 | 42 |
| Pluralistic Alignment | VITAL Overton | Value Coverage | 43.76 | 35 |
| Distributional Alignment | ModPlural (test) | JS Distance | 0.23 | 3 |
| Overton Alignment | ModPlural (test) | Value Coverage | 30.34 | 3 |
| Steerable Alignment | ModPlural (test) | Accuracy | 37.34 | 3 |