Reddit TIFU

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Summarization | Reddit TIFU | ROUGE-1: 15.81 | 10
Summarization | Reddit TIFU (test) | ROUGE-2: 0.116 | 7
Discrimination between Good Faith and Problematic agents (Summarization) | Reddit TIFU 16.1:1 | Cohen's d: 7.23 | 6
Abstractive Summarization | Reddit TIFU 42k samples (test) | ROUGE-1: 26.63 | 5
Faithfulness discrimination | Reddit TIFU | AUC: 77.2 | 4
Summarization | Reddit TIFU Long (test) | ROUGE-1: 30.31 | 4
Summarization | Reddit TIFU (evaluation) | ROUGE-1: 30.3 | 3
Abstractive Summarization | Reddit TIFU | ROUGE-1: 27.99 | 1