
L-Wilder

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Human Preferences | L-Wilder small | Preference Score: 85.9 | 14
Pointwise Scoring | L-Wilder pointwise | Kendall's Tau: 0.994 | 9