Read before Generate! Faithful Long Form Question Answering with Machine Reading
About
Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question. While current work on LFQA using large pre-trained models for generation is effective at producing fluent and somewhat relevant content, one primary challenge lies in how to generate a faithful answer that contains less hallucinated content. We propose a new end-to-end framework that jointly models answer generation and machine reading. The key idea is to augment the generation model with fine-grained, answer-related salient information, which can be viewed as an emphasis on faithful facts. State-of-the-art results on two LFQA datasets, ELI5 and MS MARCO, demonstrate the effectiveness of our method in comparison with strong baselines on automatic and human evaluation metrics. A detailed analysis further demonstrates the competency of our method in generating fluent, relevant, and more faithful answers.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Question Answering | Natural Questions (test) | -- | 72 |
| Long-form Question Answering | ELI5 (test) | ROUGE-L 27.13 | 54 |
| Question Answering | HotpotQA (test) | -- | 37 |
| Human Evaluation | MS-MARCO (test) | Preference: FiD 18 | 3 |
| Long-form Question Answering | MS MARCO (evaluation) | Fluency 2.7 | 2 |
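
The ELI5 result is reported in ROUGE-L, which scores a generated answer by the longest common subsequence (LCS) of tokens it shares with a reference answer. The following is a minimal sketch of that idea, assuming lowercased whitespace tokenization and the balanced F1 form. The function names `lcs_length` and `rouge_l_f1` are illustrative, and the exact scoring script and settings behind the reported number are not stated here, so this sketch should not be expected to reproduce it.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Balanced ROUGE-L F1 between two strings, in the range [0, 1].

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Because the LCS only requires tokens to appear in the same order, not contiguously, the score rewards answers that follow the reference's wording loosely. It measures lexical overlap rather than factual faithfulness, which is one reason the evaluation also includes human judgments.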