
PrefEval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Preference-aligned Retrieval-Augmented Generation | PrefEval | Accuracy: 77.96 | 27 |
| Personalization Evaluation | PrefEval 10 injected adversarial turns | Pref Unaware Rate: 7.4 | 10 |
| Preference evaluation via multi-choice queries | PrefEval Implicit | Accuracy: 69.9 | 8 |
| Preference evaluation via multi-choice queries | PrefEval Explicit | Accuracy: 81.3 | 8 |
| LLM Preference Alignment | PrefEval | AccPF: 68.8 | 7 |