
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning

About

Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems. This paper introduces a novel framework, Logic-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving. The method first uses an LLM to translate a natural language problem into a symbolic formulation; a deterministic symbolic solver then performs inference on the formulated problem. A self-refinement module uses the symbolic solver's error messages to revise faulty symbolic formalizations. Logic-LM is evaluated on five logical reasoning datasets: ProofWriter, PrOntoQA, FOLIO, LogicalDeduction, and AR-LSAT. On average, Logic-LM achieves a significant performance boost of 39.2% over LLMs with standard prompting and 18.4% over LLMs with chain-of-thought prompting. These findings suggest that combining LLMs with symbolic logic offers a promising avenue for faithful logical reasoning. Code and data are publicly available at https://github.com/teacherpeterpan/Logic-LLM.
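The three-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `translate` function is a stub standing in for an LLM call (a real system would prompt the model, optionally including solver feedback for self-refinement), and the solver is a tiny forward-chaining engine for propositional Horn clauses rather than the full symbolic solvers used in the paper.

```python
def translate(problem: str, feedback: str = "") -> list[str]:
    """Stub for the LLM translation step (hypothetical; a real system
    would call an LLM here, passing solver feedback on retries)."""
    if feedback:
        # Self-refinement: the "LLM" repairs the malformed rule
        # after seeing the solver's error message.
        return ["wolf", "wolf -> animal", "animal -> mortal"]
    # First attempt: contains a malformed rule with an empty head.
    return ["wolf", "wolf ->", "animal -> mortal"]


def symbolic_solve(rules: list[str], query: str) -> bool:
    """Deterministic solver: forward chaining over propositional
    Horn clauses of the form 'body -> head' plus bare facts."""
    facts, implications = set(), []
    for rule in rules:
        if "->" in rule:
            parts = [p.strip() for p in rule.split("->")]
            if len(parts) != 2 or not all(parts):
                # Error message that the refinement loop feeds back.
                raise ValueError(f"malformed rule: {rule!r}")
            implications.append((parts[0], parts[1]))
        else:
            facts.add(rule.strip())
    changed = True
    while changed:  # derive new facts until a fixed point
        changed = False
        for body, head in implications:
            if body in facts and head not in facts:
                facts.add(head)
                changed = True
    return query in facts


def logic_lm(problem: str, query: str, max_refinements: int = 2):
    """Translate -> solve, retrying with solver feedback on failure."""
    feedback = ""
    for _ in range(max_refinements + 1):
        rules = translate(problem, feedback)
        try:
            return symbolic_solve(rules, query)
        except ValueError as err:
            feedback = str(err)  # self-refinement signal
    return None  # formulation could not be repaired


print(logic_lm("Wolves are animals; animals are mortal.", "mortal"))
# → True (after one refinement round fixes the malformed rule)
```

The key design point this sketch captures is that the solver is deterministic and faithful: once the formulation parses, the answer follows from inference alone, and solver errors are the only channel through which the stochastic LLM step gets corrected.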

Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Logical reasoning | FOLIO | Accuracy | 71.6 | 119 |
| Logical reasoning | FOLIO (test) | Accuracy | 73.83 | 58 |
| Logical reasoning | ProofWriter (test) | Accuracy | 84.35 | 36 |
| Logical reasoning | ProntoQA (test) | Accuracy | 90.5 | 36 |
| Logical reasoning | ProofWriter | Accuracy | 64.7 | 32 |
| Logical reasoning | AR-LSAT (test) | Accuracy | 62.1 | 24 |
| Logical reasoning | AR-LSAT | Accuracy | 31.2 | 24 |
| Logical reasoning | Deduction (test) | Accuracy | 99.5 | 20 |
| Long-Context Stability & Retrieval | LongBench | SR (Stability) | 42.8 | 6 |
| Long-Context Stability & Retrieval | InfBench | Stability Rate (SR) | 36.7 | 6 |

Showing 10 of 15 rows.
