PE-FD

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Preference Estimation | PE-FD=6 (test) | Average Outcome Reward: 56.91 | 9
Preference Estimation | PE-FD=8 (test) | Average Outcome Reward: 61.28 | 9