Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions

About

Large language models (LLMs) are capable of answering knowledge-intensive complex questions with chain-of-thought (CoT) reasoning. However, they tend to generate factually incorrect reasoning steps when the required knowledge is not available or not up to date in the models' parameters. Recent works turn to retrieving external knowledge to augment CoT reasoning. Although promising, these chain-based methods suffer from two problems: 1) Negative retrieval: unnecessary or incorrect retrieval may mislead the reasoning; 2) Limited sight: because the model cannot look backward or forward, a local error in one step propagates along the chain. In this paper, we propose a novel approach: Probabilistic Tree-of-thought Reasoning (ProbTree). First, LLMs translate a complex question into a query tree, in which each non-root node denotes a sub-question of its parent node. Then, probabilistic reasoning is conducted over the tree by solving questions from leaf to root, considering the confidence of both question decomposition and answering. During reasoning, for leaf nodes, LLMs choose the more confident answer between Closed-book QA, which employs parametric knowledge, and Open-book QA, which employs retrieved external knowledge, thus eliminating the negative retrieval problem. For non-leaf nodes, the hierarchical structure gives LLMs a broader view and lets them reason globally with the information from child nodes, thus recovering from local errors. Experiments on three Complex QA datasets under the open-domain setting show that our approach significantly outperforms SOTA methods, demonstrating the effectiveness of probabilistic tree-of-thought reasoning.
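
The leaf-to-root procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `QueryNode`, `solve`, `decomp_logp`, and the three callables (`closed_book`, `open_book`, `aggregate`) are hypothetical names, the toy answerers stand in for real LLM and retrieval calls, and adding the decomposition log-confidence to the aggregated answer's log-confidence is only one assumed way of taking both confidences into account.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# (answer text, log-confidence); a higher log-confidence means more confident.
Answer = Tuple[str, float]


@dataclass
class QueryNode:
    question: str
    children: List["QueryNode"] = field(default_factory=list)
    # Hypothetical field: log-confidence of decomposing this node into children.
    decomp_logp: float = 0.0


def solve(
    node: QueryNode,
    closed_book: Callable[[str], Answer],
    open_book: Callable[[str], Answer],
    aggregate: Callable[[str, List[Answer]], Answer],
) -> Answer:
    """Solve a query tree from the leaves up to the root."""
    if not node.children:
        # Leaf: keep whichever of the two candidate answers is more confident.
        candidates = [closed_book(node.question), open_book(node.question)]
        return max(candidates, key=lambda a: a[1])
    # Non-leaf: solve every child first, then reason over their answers.
    child_answers = [
        solve(child, closed_book, open_book, aggregate) for child in node.children
    ]
    text, logp = aggregate(node.question, child_answers)
    # Assumption: combine answering and decomposition confidence by addition.
    return text, logp + node.decomp_logp


# Toy stand-ins for LLM calls (all values are made up for illustration).
CLOSED = {"leaf A": ("a_cb", -0.1), "leaf B": ("b_cb", -2.0)}
OPEN = {"leaf A": ("a_ob", -1.5), "leaf B": ("b_ob", -0.3)}


def toy_closed_book(question: str) -> Answer:
    return CLOSED[question]


def toy_open_book(question: str) -> Answer:
    return OPEN[question]


def toy_aggregate(question: str, child_answers: List[Answer]) -> Answer:
    # Join the child answers and average their log-confidences.
    text = " + ".join(a for a, _ in child_answers)
    logp = sum(p for _, p in child_answers) / len(child_answers)
    return text, logp


root = QueryNode(
    "root question",
    children=[QueryNode("leaf A"), QueryNode("leaf B")],
    decomp_logp=-0.2,
)
answer, logp = solve(root, toy_closed_book, toy_open_book, toy_aggregate)
print(answer, round(logp, 3))
```

In this toy run, leaf A keeps its closed-book answer and leaf B keeps its open-book answer, which is the per-leaf selection that the abstract credits with avoiding negative retrieval.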

Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou • 2023

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	2WikiMultihopQA	EM 64.3	559
Multi-hop Question Answering	HotpotQA	F1 Score 60.4	294
Question Answering	Bamboogle	EM 24.8	227
Multi-hop Question Answering	Multi-hop RAG	F1 62.5	77
Multi-hop Question Answering	MuSiQue	EM 11	50
Question Answering	GPQA	EM 37.3	34
Multi-hop Question Answering	HotpotQA	EM 32.2	20
Question Answering	HotpotQA	EM 36.4	20
Question Answering	KoBLEX	Token F1 28.06	18
Information Retrieval	KoBLEX	F1 Score 23.03	14
