
Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures

About

Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, yet their performance is highly dependent on the prompting strategy and model scale. While reinforcement learning and fine-tuning have been deployed to boost reasoning, these approaches incur substantial computational and data overhead. In this work, we introduce Adaptive Graph of Thoughts (AGoT), a dynamic, graph-based inference framework that enhances LLM reasoning solely at test time. Rather than relying on fixed-step methods like Chain of Thought (CoT) or Tree of Thoughts (ToT), AGoT recursively decomposes complex queries into structured subproblems, forming a dynamic directed acyclic graph (DAG) of interdependent reasoning steps. By selectively expanding only those subproblems that require further analysis, AGoT unifies the strengths of chain, tree, and graph paradigms into a cohesive framework that allocates computation where it is most needed. We validate our approach on diverse benchmarks spanning multi-hop retrieval, scientific reasoning, and mathematical problem-solving, achieving up to 46.2% improvement on scientific reasoning tasks (GPQA), comparable to gains achieved through computationally intensive reinforcement learning approaches and outperforming state-of-the-art iterative approaches. These results suggest that dynamic decomposition and structured recursion offer a scalable, cost-effective alternative to post-training modifications, paving the way for more robust, general-purpose reasoning in LLMs.
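
The recursive decomposition and selective expansion described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names Node, solve_graph, decompose, is_complex, solve, and combine are hypothetical, the max_depth cutoff is an assumption, and simple string functions stand in for the LLM calls that a real system would make.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    question: str
    deps: List[int] = field(default_factory=list)  # indices of earlier nodes
    answer: str = ""
    expanded: bool = False  # True when solved through a nested graph


def solve_graph(
    question: str,
    decompose: Callable[[str], List[Node]],
    is_complex: Callable[[str], bool],
    solve: Callable[[str, List[str]], str],
    combine: Callable[[str, List[str]], str],
    depth: int = 0,
    max_depth: int = 2,  # assumed recursion limit, not taken from the paper
) -> Tuple[str, List[Node]]:
    """Decompose a question, expand only complex subproblems, combine answers."""
    nodes = decompose(question)
    for i, node in enumerate(nodes):
        # Edges may only point to earlier nodes, which keeps the graph acyclic.
        if any(d < 0 or d >= i for d in node.deps):
            raise ValueError("deps must point to earlier nodes")
        if depth < max_depth and is_complex(node.question):
            # Selective expansion: recurse only where more analysis is needed.
            node.answer, _ = solve_graph(
                node.question, decompose, is_complex, solve, combine,
                depth + 1, max_depth,
            )
            node.expanded = True
        else:
            context = [nodes[d].answer for d in node.deps]
            node.answer = solve(node.question, context)
    return combine(question, [n.answer for n in nodes]), nodes


# Toy stand-ins for LLM calls (hypothetical, for illustration only).
def toy_decompose(q: str) -> List[Node]:
    sep = ";" if ";" in q else ","
    parts = [p.strip() for p in q.split(sep)]
    return [Node(p, deps=list(range(i))) for i, p in enumerate(parts)]


def toy_is_complex(q: str) -> bool:
    return ";" in q or "," in q


def toy_solve(q: str, context: List[str]) -> str:
    return q.upper()


def toy_combine(q: str, answers: List[str]) -> str:
    return "+".join(answers)


if __name__ == "__main__":
    answer, nodes = solve_graph(
        "a; b, c", toy_decompose, toy_is_complex, toy_solve, toy_combine
    )
    print(answer)  # prints "A+B+C"
```

In this toy run, the subproblem "a" is answered directly, while "b, c" is flagged as complex and expanded into its own nested graph, so extra computation is spent only on the part that needs it.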

Tushar Pandey, Ara Ghukasyan, Oktay Goktas, Santosh Kumar Radha • 2025

Related benchmarks

Task                                     Dataset                       Result         Rank
Mathematical Reasoning                   Game of 24                    Accuracy 74    103
Multi-hop Question Answering             MoreHopQA                     Accuracy 70    25
Multi-hop Question Answering             HotpotQA                      Accuracy 72    15
Graduate-level Question Answering        GPQA                          Accuracy 64.6  11
Multiple-Choice Reasoning                GPQA (test)                   Accuracy 64.6  11
Explorative Reasoning                    Game of 24 (test)             Accuracy 74    11
Open-ended Question Answering            HybridQA (test)               Accuracy 84    11
Question Answering over Tables and Text  HybridQA                      Accuracy 84    11
Explorative Reasoning                    Crosswords Word-level (test)  Accuracy 3.5   11
Open-ended Question Answering            MoreHopQA (test)              Accuracy 70    11
Showing 10 of 20 rows
