
Pairwise human preference on Prolific user study large-scale (test)

Winning Rate: 97.33

PrefGen

Dec 4, 2025
Updated 1mo ago

Evaluation Results

Date      Winning Rate
2025.12   97.33
2025.12   90.67
2025.12   88.67
2025.12   88.67
2025.12   88
2025.12   78.67