RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback
About
Large language models (LLMs) demonstrate exceptional performance in numerous tasks but still rely heavily on knowledge stored in their parameters. Moreover, updating this knowledge incurs high training costs. Retrieval-augmented generation (RAG) methods address this issue by integrating external knowledge: by retrieving knowledge relevant to the query, the model can answer questions it previously could not. This approach improves performance in certain scenarios for specific tasks. However, retrieving irrelevant texts can impair model performance. In this paper, we propose Retrieval Augmented Iterative Self-Feedback (RA-ISF), a framework that iteratively decomposes tasks and processes them in three submodules to enhance the model's problem-solving capabilities. Experiments show that our method outperforms existing baselines and performs well on models such as GPT3.5 and Llama2, significantly enhancing factual reasoning capabilities and reducing hallucinations.
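The abstract does not name the three submodules or spell out the control flow, so the following is only a minimal sketch of one plausible reading. It assumes the submodules are a self-knowledge check, a passage-relevance filter, and a question decomposer. Every name here (`ra_isf_answer`, `knows`, `retrieve`, `is_relevant`, `decompose`, `generate`, `max_depth`) is hypothetical and stands in for a model or retriever call; none of them comes from the authors' code.

```python
from typing import Callable, List


def ra_isf_answer(
    question: str,
    knows: Callable[[str], bool],              # assumed submodule 1: self-knowledge check
    retrieve: Callable[[str], List[str]],      # assumed external retriever
    is_relevant: Callable[[str, str], bool],   # assumed submodule 2: passage relevance
    decompose: Callable[[str], List[str]],     # assumed submodule 3: question decomposition
    generate: Callable[[str, List[str]], str], # assumed LLM call: (question, context) -> answer
    max_depth: int = 3,                        # assumed cap on decomposition depth
    depth: int = 0,
) -> str:
    """Answer a question by falling back from own knowledge to retrieval to decomposition."""
    if depth > max_depth:
        return "unknown"

    # 1. If the model judges it already knows the answer, answer without retrieval.
    if knows(question):
        return generate(question, [])

    # 2. Otherwise retrieve passages and keep only those judged relevant.
    passages = [p for p in retrieve(question) if is_relevant(question, p)]
    if passages:
        return generate(question, passages)

    # 3. No relevant passage: split the question and solve each part recursively.
    sub_questions = decompose(question)
    if not sub_questions:
        return "unknown"
    sub_answers = [
        ra_isf_answer(sq, knows, retrieve, is_relevant, decompose,
                      generate, max_depth, depth + 1)
        for sq in sub_questions
    ]

    # Feed the sub-question/sub-answer pairs back as context for the original question.
    context = [f"{sq} -> {sa}" for sq, sa in zip(sub_questions, sub_answers)]
    return generate(question, context)
```

Passing the submodules as callables keeps the sketch independent of any particular model or retriever. The depth cap is one simple way to make the iteration terminate when decomposition never produces an answerable sub-question.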
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-hop Question Answering | 2WikiMultihopQA | EM | 36.1 | 278 |
| Multi-hop Question Answering | HotpotQA | -- | -- | 221 |
| Multi-hop Question Answering | MuSiQue | EM | 10.6 | 106 |
| Open-domain Question Answering | TriviaQA | EM | 76.1 | 62 |
| Single-hop Question Answering | TriviaQA | EM | 49.2 | 62 |
| Single-hop Question Answering | PopQA | EM | 29.2 | 55 |
| Multi-hop Question Answering | Bamboogle | EM | 28.9 | 37 |
| Question Answering | StrategyQA | EM | 75.9 | 35 |
| Open-domain Question Answering | NQ (Natural Questions) | EM | 40.2 | 33 |
| Question Answering | Average (NQ, TriviaQA, HotpotQA, StrategyQA, 2WikiMHQA) | Average Score | 55 | 14 |