
DyCoBench

Benchmarks

Task Name                       Dataset Name                     SOTA Result                    Trend
Pairwise Preference Prediction  DyCoBench-1K Overall Preference  Preference Rate (A > B): 78.3  17
Pairwise Preference Prediction  DyCoBench-1K Multiple Criteria   Preference Rate (A > B): 65.6  17
Pairwise Preference Prediction  DyCoBench-1K Single Criterion    P(A > B): 70.2                 17