
PPE Preference

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Reward Modeling | PPE Preference ZH | Accuracy 82.3 | 19 |
| Preference Evaluation | PPE Preference (test) | Kuiper Statistic 0.0434 | 8 |