Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

ELI5: Long Form Question Answering

About

We introduce the first large-scale corpus for long-form question answering, a task requiring elaborate and in-depth answers to open-ended questions. The dataset comprises 270K threads from the Reddit forum ``Explain Like I'm Five'' (ELI5) where an online community provides answers to questions which are comprehensible by five year olds. Compared to existing datasets, ELI5 comprises diverse questions requiring multi-sentence answers. We provide a large set of web documents to help answer the question. Automatic and human evaluations show that an abstractive model trained with a multi-task objective outperforms conventional Seq2Seq, language modeling, as well as a strong extractive baseline. However, our best model is still far from human performance since raters prefer gold responses in over 86% of cases, leaving ample opportunity for future improvement.

Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli• 2019

Related benchmarks

TaskDatasetResultRank
Long-form Question AnsweringELI5 (test)
ROUGE-L23.1
54
Long-form Question AnsweringELI5
ROUGE-L14.35
27
Lead Paragraph GenerationWikiSum CommonCrawl static (test)
ROUGE-L34.4
8
Showing 3 of 3 rows

Other info

Follow for update