
Reddit TLDR

Benchmarks

Task Name                     Dataset Name                      SOTA Result                Trend
Personalized Reward Modeling  Reddit TLDR 150 examples Overall  User-level Accuracy: 69.7  11
Personalized Reward Modeling  Reddit TLDR 150 examples Unseen   User-level Accuracy: 69.8  11
Personalized Reward Modeling  Reddit TLDR 150 examples Seen     User-level Accuracy: 69.7  11
Personalized Reward Modeling  Reddit TLDR 100 examples Overall  User-level Accuracy: 69.6  11
Personalized Reward Modeling  Reddit TLDR 100 examples Unseen   User-level Accuracy: 69.6  11
Personalized Reward Modeling  Reddit TLDR 100 examples Seen     User-level Accuracy: 69.6  11