
Labels have Human Values: Value Calibration of Subjective Tasks

About

Building NLP systems for subjective tasks requires aligning them with contrasting human values. We propose the MultiCalibrated Subjective Task Learner framework (MC-STL), which groups annotations into identifiable human value clusters using three approaches (similarity of annotator rationales, expert value taxonomies, or raters' sociocultural descriptors) and calibrates predictions for each value cluster by learning cluster-specific embeddings. We demonstrate MC-STL in several subjective learning settings, including ordinal, binary, and preference learning, and evaluate it on multiple datasets covering toxic chatbot conversations, offensive social media posts, and human preference alignment. The results show that MC-STL consistently outperforms baselines that ignore the latent value structure of the annotations, delivering gains in discrimination, value-specific calibration, and disagreement-aware metrics.

Mohammed Fayiz Parappan, Ricardo Henao • 2026
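
The abstract reports gains in "value-specific calibration" without defining the metric on this page. As a rough illustration only, the sketch below computes a binned expected calibration error (ECE) separately for each value cluster. The choice of ECE, the equal-width binning, and the names `expected_calibration_error` and `per_cluster_ece` are assumptions made for this example, not the authors' evaluation code.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary predictions (assumed metric, not taken from the paper)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Assign each predicted probability to an equal-width bin over [0, 1].
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        # Gap between mean confidence and observed positive rate in this bin,
        # weighted by the fraction of examples that fall into the bin.
        gap = abs(probs[mask].mean() - labels[mask].mean())
        ece += mask.mean() * gap
    return float(ece)

def per_cluster_ece(probs, labels, cluster_ids, n_bins=10):
    """Compute ECE separately for each (hypothetical) value cluster id."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    return {
        c.item(): expected_calibration_error(
            probs[cluster_ids == c], labels[cluster_ids == c], n_bins
        )
        for c in np.unique(cluster_ids)
    }

# Toy data: two value clusters with two annotations each.
scores = per_cluster_ece(
    probs=[0.75, 0.75, 0.25, 0.25],
    labels=[1, 1, 0, 1],
    cluster_ids=[0, 0, 1, 1],
)
```

A model can look well calibrated on pooled data while being miscalibrated within individual clusters, which is why the per-cluster breakdown is reported separately here.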

Related benchmarks

Task | Dataset | Metric | Result | Rank
Binary Classification | VP+Schwartz Binary (test) | Overall AUC | 0.8 | 3
Preference Learning | Anthropic HH-RLHF+VI Preference (test) | Overall Accuracy | 64 | 3
Rater Rationale Clustering | VP-Duty Binary (test) | Overall AUC | 0.76 | 3
Rater Rationale Clustering | VP-Right Ordinal (test) | Overall AUC | 75 | 3
Rater Rationale Clustering | VP-Duty Ordinal (test) | Overall AUC | 70 | 3
Rater Rationale Clustering | VP-Right Binary (test) | Overall AUC | 81 | 3
Rater Rationale Clustering | VP-Value Binary (test) | Overall AUC | 80 | 3
Rater Rationale Clustering | VP-Value Ordinal (test) | Overall AUC | 70 | 3
Sociocultural Clustering | DICES-990 Ordinal | Overall AUC | 75 | 3
Sociocultural Clustering | D3 Binary | Overall AUC | 0.63 | 3