Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Self-Guiding Exploration for Combinatorial Problems

About

Large Language Models (LLMs) have become pivotal in addressing reasoning tasks across diverse domains, including arithmetic, commonsense, and symbolic reasoning. They utilize prompting techniques such as Exploration-of-Thought, Decomposition, and Refinement to effectively navigate and solve intricate tasks. Despite these advancements, the application of LLMs to Combinatorial Problems (CPs), known for their NP-hardness and critical roles in logistics and resource management remains underexplored. To address this gap, we introduce a novel prompting strategy: Self-Guiding Exploration (SGE), designed to enhance the performance of solving CPs. SGE operates autonomously, generating multiple thought trajectories for each CP task. It then breaks these trajectories down into actionable subtasks, executes them sequentially, and refines the results to ensure optimal outcomes. We present our research as the first to apply LLMs to a broad range of CPs and demonstrate that SGE outperforms existing prompting strategies by over 27.84% in CP optimization performance. Additionally, SGE achieves a 2.46% higher accuracy over the best existing results in other reasoning tasks (arithmetic, commonsense, and symbolic).

Zangir Iklassov, Yali Du, Farkhad Akimov, Martin Takac• 2024

Related benchmarks

TaskDatasetResultRank
Commonsense ReasoningARC Challenge
Accuracy93.28
243
Arithmetic ReasoningGSM8K (test)
Accuracy97.35
189
Commonsense ReasoningStrategyQA (test)
Accuracy83.49
119
Commonsense ReasoningCSQA (test)
Accuracy85.68
111
Arithmetic ReasoningSVAMP (test)
Accuracy98.16
70
Mathematical ReasoningASDiv (test)
Accuracy97.24
62
Arithmetic ReasoningAQuA (test)
Accuracy74.63
58
Permutation Flow Shop Scheduling ProblemPFSP
Optimality Gap0.0448
24
Maximum Independent SetMIS
Feasibility Rate94
21
Capacitated Vehicle Routing ProblemCVRP
Feasibility92
21
Showing 10 of 44 rows

Other info

Code

Follow for update