Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Preference Optimization

Benchmarks

Task NameDataset NameSOTA ResultTrend
Machine TranslationPreference Optimization Machine Translation
Reward0.25
2
SummarizationPreference Optimization Summarization
Reward0.3
2
Conversational AssistantPreference Optimization Conversational
Reward0.28
2
Showing 3 of 3 rows