Pluralistic Reward Model Learning on PRISM

59.6 Accuracy

EpiPersona

Updated 3mo ago

Evaluation Results

Method	Links	Accuracy
EpiPersona 2026.03		59.6
EpiPersona 2026.03		58.54
VPL 2026.03		58.26
VPL 2026.03		58.23
BT 2026.03		57.11
PAL 2026.03		56.81
BT 2026.03		56.58
GPO 2026.03		56.48
GPO 2026.03		55.26
PAL 2026.03		54.23