
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation

About

We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models' reasoning and generation ability in long-horizon generation tasks, while substantially mitigating hallucination. In particular, the proposed method -- *retrieval-augmented thoughts* (RAT) -- revises each thought step one by one with information retrieved for the task query together with the current and past thought steps, after the initial zero-shot CoT is generated. Applying RAT to GPT-3.5, GPT-4, and CodeLLaMA-7b substantially improves their performance on various long-horizon generation tasks, with average relative rating-score increases of 13.63% on code generation, 16.96% on mathematical reasoning, 19.2% on creative writing, and 42.78% on embodied task planning. The demo page can be found at https://craftjarvis.github.io/RAT
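The revision loop described above can be sketched in a few lines. This is a minimal illustration of the control flow, not the paper's implementation: `generate_cot`, `retrieve`, and `revise` are hypothetical stand-ins for the LLM and retrieval calls, stubbed here so the loop runs end to end.

```python
def generate_cot(task):
    # Stand-in for a zero-shot chain-of-thought draft from an LLM.
    return ["draft step 1", "draft step 2", "draft step 3"]

def retrieve(query):
    # Stand-in for an information-retrieval call (e.g. a search over a corpus).
    return f"evidence for: {query[:40]}"

def revise(step, evidence):
    # Stand-in for the LLM rewriting one thought step given retrieved evidence.
    return f"{step} (revised using {evidence})"

def rat(task):
    """Revise each thought step one by one, retrieving with a query built
    from the task plus the current and already-revised past steps."""
    thoughts = generate_cot(task)  # initial zero-shot CoT
    revised = []
    for step in thoughts:
        # The retrieval query covers the task query, past (revised) steps,
        # and the step currently under revision, as the abstract describes.
        query = " ".join([task] + revised + [step])
        evidence = retrieve(query)
        revised.append(revise(step, evidence))
    return revised

print(len(rat("example task")))  # → 3: one revised thought per draft step
```

The key design point is that later steps are revised against the *already revised* earlier steps, so corrections propagate forward through the chain rather than each step being revised in isolation.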

Zihao Wang, Anji Liu, Haowei Lin, Jiaqi Li, Xiaojian Ma, Yitao Liang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Question Answering | 2WikiMQA | -- | 44 |
| Question Answering | Bamboogle | Cover Exact Match: 58.4 | 18 |
| Question Answering | MuSiQue | Cover EM: 20.8 | 18 |
| Question Answering | HotpotQA | Cover EM: 48.9 | 18 |
| Question Answering | AmbigQA | Cover EM: 56.4 | 18 |
| Question Answering | NQ | Cover EM: 0.536 | 18 |
| Aggregation Query Execution | AGGBench | Mean NACE: 0.793 | 11 |
| Aggregation Query Execution | AGGBench-Core | NACE (mean): 0.787 | 11 |
| Question Answering | SpiderQA (test) | NACE (mean): 0.385 | 11 |
