Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

About

Large Language Models (LLMs) have demonstrated remarkable abilities across various language tasks, but solving complex reasoning problems remains a significant challenge. While existing methods, such as Chain-of-Thought (CoT) and Tree-of-Thought (ToT), enhance reasoning by decomposing problems or structuring prompts, they typically perform a single pass of reasoning and may fail to revisit flawed paths, compromising accuracy. To address this limitation, we propose a novel reasoning framework called Forest-of-Thought (FoT), which integrates multiple reasoning trees to leverage collective decision-making for solving complex logical problems. FoT employs sparse activation strategies to select the most relevant reasoning paths, improving both efficiency and accuracy. Additionally, we introduce a dynamic self-correction strategy that enables real-time error correction, along with consensus-guided decision-making strategies to optimize both correctness and computational resources. Experimental results demonstrate that the FoT framework, combined with these strategies, significantly enhances the reasoning capabilities of LLMs, enabling them to solve complex tasks with greater precision and efficiency. Code will be available at https://github.com/iamhankai/Forest-of-Thought.
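As a rough illustration of the consensus idea described above, the following minimal sketch runs several independent reasoning trees, keeps only the highest-scoring results, and takes a majority vote. It is not the authors' implementation. The `Tree` interface, the `forest_consensus` function, the `top_k` parameter, and the per-tree confidence score are all hypothetical names introduced for this example. Sparse activation is simplified here to a top-k filter over whole trees, and the dynamic self-correction step is not modeled.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical interface (an assumption, not from the paper): each reasoning
# tree takes a question and returns (answer, confidence score).
Tree = Callable[[str], Tuple[str, float]]


def forest_consensus(question: str, trees: List[Tree], top_k: int) -> str:
    """Query every tree, keep the top_k highest-scoring results, majority-vote."""
    if not trees or top_k < 1:
        raise ValueError("need at least one tree and top_k >= 1")

    # Run each tree independently on the same question.
    results = [tree(question) for tree in trees]

    # Simplified stand-in for sparse activation: keep only the top_k results
    # ranked by their confidence score.
    activated = sorted(results, key=lambda r: r[1], reverse=True)[:top_k]

    # Consensus by majority vote. Ties go to the answer seen first, which is
    # the higher-scoring one because `activated` is sorted by score.
    votes = Counter(answer for answer, _ in activated)
    return votes.most_common(1)[0][0]


# Toy trees that ignore the question and return fixed (answer, score) pairs.
demo_trees: List[Tree] = [
    lambda q: ("4", 0.9),
    lambda q: ("5", 0.2),
    lambda q: ("4", 0.7),
    lambda q: ("5", 0.1),
]
demo_answer = forest_consensus("What is 2 + 2?", demo_trees, top_k=3)
```

In the toy run, the three highest-scoring results are "4", "4", and "5", so the vote returns "4". A real system would replace the lambdas with calls to an LLM-driven tree search.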

Zhenni Bi, Kai Han, Chuanjian Liu, Yehui Tang, Yunhe Wang • 2024

Related benchmarks

Task	Dataset	Result
Mathematical Reasoning	GSM8K (test)	Accuracy 94	954
Reasoning	BBH	Accuracy 82.4	726
Arithmetic Reasoning	GSM8K	Accuracy 94.2	272
Logical reasoning	BBH	Accuracy 82.6	249
Mathematical Reasoning	OlympiadBench	Accuracy 12.6	213
General Reasoning	BBH	Accuracy 83.5	190
Grade School Math Reasoning	GSM8K	Accuracy (GSM8K) 94.2	138
Long-context Reasoning	LongBench	Accuracy (LongBench) 61.5	101
Language Understanding	MMLU CF	Score 72.4	66
General Knowledge Reasoning	MMLU CF	Accuracy 70.6	64

Showing 10 of 37 rows
