Reddit TL;DR

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Summarization | Reddit TL;DR (test) | Preference vs SFT (%): 75.61 | 8
Reward Modeling | Reddit TL;DR 70-30 split (test) | Win-Rate: 53.221 | 3